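#29 Repost: The Quicksand Beneath My Feet
Fine Prose and Quotes 2013-10-17
All these years, the quicksand beneath my feet has carried me drifting from place to place. It never swallows me; it only reminds me from time to time that I have no other choice, or else the wind will blow me away. And so I drifted in a daze through all my hot-blooded years, carried east, carried west, no better off than even the seeds I once despised.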
coding in a complicated world
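Tornado web server is an extremely lightweight, highly scalable web server written in Python that uses non-blocking I/O; the well-known site FriendFeed was built on it.
Unlike other mainstream web server frameworks (mainly Python ones), Tornado uses epoll-based non-blocking I/O, so it responds quickly, can handle thousands of concurrent connections, and is particularly well suited to real-time web services.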
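To make the readiness-based model concrete, here is a minimal sketch using the standard library `selectors` module (which picks epoll on Linux). It is not Tornado's own code; the function name `echo_once` is hypothetical, and it assumes Python 3 rather than the Python 2.x versions recommended in this post.

```python
import selectors
import socket


def echo_once(payloads):
    """Push each payload through its own socket pair and collect all of them
    from one event loop, without threads and without blocking reads."""
    sel = selectors.DefaultSelector()  # epoll on Linux, kqueue on BSD/macOS
    writers = []
    for index, data in enumerate(payloads):
        reader, writer = socket.socketpair()
        reader.setblocking(False)
        # Ask to be notified when this socket has data to read.
        sel.register(reader, selectors.EVENT_READ, data=index)
        writer.sendall(data)
        writers.append(writer)

    received = {}
    while len(received) < len(payloads):
        # select() returns only the sockets that are ready, so recv() does not block.
        for key, _events in sel.select(timeout=1.0):
            received[key.data] = key.fileobj.recv(4096)
            sel.unregister(key.fileobj)
            key.fileobj.close()

    for writer in writers:
        writer.close()
    sel.close()
    return [received[i] for i in range(len(payloads))]


if __name__ == "__main__":
    print(echo_once([b"a", b"b", b"c"]))
```

A single thread waits on many sockets at once and only touches the ones that are ready, which is the general idea behind handling thousands of connections without one thread per connection.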
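To use it, install the following packages:
1) Python (Python 2.5 / Python 2.6 recommended)
2) Simplejson (simplejson 2.0.9 recommended)
3) cURL (curl 7.19.7 or later recommended)
4) Pycurl (pycurl 7.16.2.1 recommended)
5) Tornado Web Server (the star of the show; just install the latest version from the official site)
The simplest possible service: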
import tornado.ioloop
import tornado.web

class MainHandler(tornado.web.RequestHandler):
    def get(self):
        self.write("Hello, world")

application = tornado.web.Application([
    (r"/", MainHandler),
])

if __name__ == "__main__":
    application.listen(8888)
    tornado.ioloop.IOLoop.instance().start()
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import os
import sys

import tornado.web
from tornado import autoreload
from tornado.wsgi import WSGIContainer
from tornado.httpserver import HTTPServer
from tornado.ioloop import IOLoop
from tornado.web import url
from django.conf import settings
from django.core.handlers.wsgi import WSGIHandler

# Put the project directory first on sys.path so that `settings` can be imported.
if os.path.dirname(__file__) not in sys.path[:1]:
    sys.path.insert(0, os.path.dirname(__file__))
os.environ['DJANGO_SETTINGS_MODULE'] = 'settings'

class Application(tornado.web.Application):
    def __init__(self):
        # Static and media files are served directly by Tornado.
        handlers = [
            url(r"/static/(.+)", tornado.web.StaticFileHandler, dict(path=settings.MEDIA_ROOT), name='static_path'),
            url(r"/media/(.+)", tornado.web.StaticFileHandler, dict(path=settings.MEDIA_ROOT), name='media_path'),
        ]
        # Every other URL falls through to the Django WSGI application.
        handlers.append((r'.*', tornado.web.FallbackHandler, dict(fallback=WSGIContainer(WSGIHandler()))))
        tornado.web.Application.__init__(self, handlers)

http_server = HTTPServer(Application())
http_server.listen(8080)
loop = IOLoop.instance()
autoreload.start(loop)  # automatically reload modified code
loop.start()
Informational (100-199)
Success (200-299)
Redirection (300-399)
Client error (400-499)
Server error (500-599)
{
"100": "Continue",
"101": "Switching Protocols",
"102": "Processing",
"103": "Early Hints",
"200": "OK",
"201": "Created",
"202": "Accepted",
"203": "Non-Authoritative Information",
"204": "No Content",
"205": "Reset Content",
"206": "Partial Content",
"207": "Multi-Status",
"208": "Already Reported",
"226": "IM Used",
"300": "Multiple Choices",
"301": "Moved Permanently",
"302": "Found",
"303": "See Other",
"304": "Not Modified",
"307": "Temporary Redirect",
"308": "Permanent Redirect",
"400": "Bad Request",
"401": "Unauthorized",
"402": "Payment Required",
"403": "Forbidden",
"404": "Not Found",
"405": "Method Not Allowed",
"406": "Not Acceptable",
"407": "Proxy Authentication Required",
"408": "Request Timeout",
"409": "Conflict",
"410": "Gone",
"411": "Length Required",
"412": "Precondition Failed",
"413": "Payload Too Large",
"414": "URI Too Long",
"415": "Unsupported Media Type",
"416": "Range Not Satisfiable",
"417": "Expectation Failed",
"418": "I'm a teapot",
"421": "Misdirected Request",
"422": "Unprocessable Entity",
"423": "Locked",
"424": "Failed Dependency",
"425": "Too Early",
"426": "Upgrade Required",
"428": "Precondition Required",
"429": "Too Many Requests",
"431": "Request Header Fields Too Large",
"451": "Unavailable For Legal Reasons",
"500": "Internal Server Error",
"501": "Not Implemented",
"502": "Bad Gateway",
"503": "Service Unavailable",
"504": "Gateway Timeout",
"505": "HTTP Version Not Supported",
"506": "Variant Also Negotiates",
"507": "Insufficient Storage",
"508": "Loop Detected",
"510": "Not Extended",
"511": "Network Authentication Required"
}
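As a small illustration of how the five classes relate to the codes in the table, the following sketch derives the class from the first digit and looks up the reason phrase in Python's standard `http.HTTPStatus` registry. The helper names `status_class` and `reason_phrase` are hypothetical, and the registry's phrases can differ slightly from the table depending on the Python version.

```python
from http import HTTPStatus

# The class of a status code is determined by its first digit.
CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}


def status_class(code):
    """Return the class name for an HTTP status code in the range 100-599."""
    if not 100 <= code <= 599:
        raise ValueError("not an HTTP status code: %r" % (code,))
    return CLASSES[code // 100]


def reason_phrase(code):
    """Return the standard reason phrase, or None if the stdlib does not know the code."""
    try:
        return HTTPStatus(code).phrase
    except ValueError:
        return None


if __name__ == "__main__":
    for code in (200, 301, 404, 503):
        print(code, status_class(code), reason_phrase(code))
```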
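I have added sitemap.xml and robots.txt to this site.
Two commonly used SEO tools are the sitemap (sitemap.xml) and the robots exclusion file for crawlers (robots.txt).
Please forgive me for giving robots.txt such a long name.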
A sitemap is structured roughly like this:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/aaa</loc>
    <lastmod>2005-01-01</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.8</priority>
  </url>
  <url>
    <loc>http://www.example.com/bbb</loc>
    <lastmod>2005-01-01</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.7</priority>
  </url>
  <url>
    <loc>http://www.example.com/ccc</loc>
    <lastmod>2005-01-01</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.7</priority>
  </url>
</urlset>
PS: XML can be styled too. This site has a simple stylesheet; open the sitemap page to see it.
All it takes is a stylesheet declaration placed before urlset: <?xml-stylesheet type="text/xsl" href="sitemap.xsl"?>
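A sitemap with this structure can be generated rather than written by hand. The sketch below uses the standard `xml.etree.ElementTree` module; the function name `build_sitemap` and the tuple layout of its input are assumptions made for illustration, and it leaves out the optional stylesheet declaration.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Build sitemap XML from (loc, lastmod, changefreq, priority) tuples."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod, changefreq, priority in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
        ET.SubElement(url, "changefreq").text = changefreq
        ET.SubElement(url, "priority").text = str(priority)
    # ElementTree escapes special characters such as & in URLs for us.
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    print(build_sitemap([
        ("http://www.example.com/aaa", "2005-01-01", "monthly", 0.8),
        ("http://www.example.com/bbb", "2005-01-01", "weekly", 0.7),
    ]))
```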
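robots.txt tells crawlers not to fetch the specified paths.
Strictly speaking it is not really a protocol, only an industry convention, and even large companies do not always follow it (see the case in which Baidu sued Qihoo 360 for violating the Robots protocol and claimed 100 million yuan in damages).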
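For a sense of how a well-behaved crawler honors this convention, here is a small sketch using the standard `urllib.robotparser` module. The robots.txt content and the paths in it are made up for the example; they are not this site's actual rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used only for this example.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /tmp/
"""


def make_parser(robots_txt):
    """Parse robots.txt text and return a parser that can answer can_fetch()."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


if __name__ == "__main__":
    parser = make_parser(EXAMPLE_ROBOTS_TXT)
    for path in ("/aaa", "/admin/login"):
        url = "http://www.example.com" + path
        print(path, "allowed" if parser.can_fetch("*", url) else "disallowed")
```

Nothing enforces the result: a crawler that never calls `can_fetch` can still request the disallowed paths, which is why robots.txt is a convention rather than access control.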
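The following content comes from: Wikipedia - Backlink.
A backlink is a link from site A to site B, made via the domain name or anchor text, which raises the linked site's weight in search rankings.
Ways to increase backlinks
Add your site name to your forum signature, so every post carries a link to your site
Start a blog and include your URL in the articles you publish
Submit articles to relevant portals and include a link to your site
Buy links; this method is not very reliable and is not recommended
Leave comments on other people's blogs
Exchange friendly links with other sites
What backlinks do
For SEO, backlinks help a site rank well, so the quality of its backlinks directly affects the site's overall weight and traffic.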
runserver port in use, error: Errno 10013
When running Django's runserver, Error 10013 appears, meaning port 8000 is already in use.
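One way to confirm the diagnosis before starting the server is to try binding the port yourself. The sketch below is a generic check using the standard `socket` module, not part of Django; the helper name `port_is_free` is hypothetical.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can currently be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # The bind was refused, for example because another process holds the port.
            return False
    return True


if __name__ == "__main__":
    for candidate in (8000, 8001, 8080):
        print(candidate, "free" if port_is_free(candidate) else "in use")
```

If 8000 is taken, runserver accepts a different port as an argument, for example `python manage.py runserver 8001`.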
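MySQL ran into a problem: inserting a long string (twenty-odd KB) resulted in the value being truncated.
After inserting data through MySQLdb, the table in the database was still empty, and the cause was not obvious.
With mysqld's log option enabled, the log file did show the SQL statements being executed, and copying a statement into the client and running it there worked fine. So why did none of the inserts made through MySQLdb take effect?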
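The excerpt does not give the answer, but one common cause of exactly these symptoms (an assumption here, not a confirmed diagnosis) is a missing commit: MySQLdb does not autocommit by default, so on a transactional engine such as InnoDB the INSERT is executed and logged, then rolled back when the connection closes without `conn.commit()`. The sketch below demonstrates the same behavior with the standard `sqlite3` module as a stand-in, since it needs no server; the table name `notes` and the helper names are hypothetical.

```python
import os
import sqlite3
import tempfile


def visible_rows(db_path):
    """Count the rows of `notes` as seen by a separate, fresh connection."""
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute("SELECT COUNT(*) FROM notes").fetchall()[0][0]
    finally:
        conn.close()


def demo():
    """Return the row counts seen by another connection before and after commit."""
    fd, db_path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    writer = sqlite3.connect(db_path)
    try:
        writer.execute("CREATE TABLE notes (body TEXT)")
        writer.commit()
        # The INSERT runs inside an implicit transaction that is still open.
        writer.execute("INSERT INTO notes (body) VALUES (?)", ("hello",))
        before = visible_rows(db_path)
        writer.commit()
        after = visible_rows(db_path)
    finally:
        writer.close()
        os.remove(db_path)
    return before, after


if __name__ == "__main__":
    print(demo())
```

The other connection sees no rows until the writer commits. With MySQLdb the equivalent fix would be calling `conn.commit()` after the inserts.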
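The ::selection selector
While reading a basic example from the CodeIgniter framework, I came across a selector that was new to me and that I had not noticed before: two colons followed by selection.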
| Version | Release date |
|---|---|
| 1.0 | 1994-02-14 |
| 1.1 | 1994-10-09 |
| 1.2 | 1995-03-28 |
| 2.0 | 1997-10-26 |
| 2.1 | 2000-04-12 |
| 2.2 | 2001-05-23 |
| 2.3 | 2004-01-29 |
| 3.0 | 2015-05-18 |
| Directory | CentOS 7 | Ubuntu | Meaning |
|---|---|---|---|
| / | | | Root directory |
| /sys/ | | | sysfs virtual filesystem, kernel information |
| /proc/ | | | proc virtual filesystem, process information |
| /tmp/ | | | Temporary files |
| /home/ | | | User home directories |
| /root/ | | | |
| /boot/ | | | Boot-related: kernel + GRUB |
| /bin/ | /usr/bin/ | /usr/bin/ | System programs |
| /sbin/ | /usr/sbin/ | /usr/sbin/ | System programs (system administration) |
| /lib/ | /usr/lib/ | /usr/lib/ | Shared libraries |
| /lib64/ | /usr/lib64/ | /usr/lib64/ | Shared libraries |
| /usr/ | | | |
| /etc/ | | | Configuration files |
| /var/ | | | |
| /opt/ | | | |
| /data/ | | | |
| /srv/ | | | |
| /dev/ | | | Devices |
| /mnt/ | | | Mount points |
| /media/ | | | Mount points |
| /lost+found/ | - | | Deleted files still in use |
| /run/ | /run/ | | |
| /snap/ | | | |
| /cdrom/ | | | |
| Directory | Meaning |
|---|---|
| /var/log/ | |
| /var/cache/ | |
| /var/mail/ | |
| /var/run/ | |
PS: In newer releases, /var/run/ has become a symlink to /run/.
UNIX Software Resource
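| Directory | Meaning |
|---|---|
| /usr/bin/ | Applications |
| /usr/sbin/ | Applications (system administration) |
| /usr/lib/ | |
| /usr/lib64/ | |
| /usr/local/ | |
| /usr/include/ | |
| /usr/share/ | |
| /usr/src/ | |
PS: The three levels /, /usr/, and /usr/local/ share a similar directory layout.
PS: In newer releases, /bin/, /sbin/, /lib/, and /lib64/ have been merged with the directories of the same name under /usr/.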
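Whether a given system has this merged layout can be checked by looking at whether the top-level directories are symlinks. The following is a minimal sketch using the standard `os` module; the function name `usr_merge_status` and its `root` parameter are assumptions made so the check can also be pointed at a test directory.

```python
import os

# Top-level directories that newer releases replace with symlinks into /usr/.
MERGED_DIRS = ("bin", "sbin", "lib", "lib64")


def usr_merge_status(root="/"):
    """Map each directory name to its symlink target, or to None if it is
    a real directory or does not exist under `root`."""
    status = {}
    for name in MERGED_DIRS:
        path = os.path.join(root, name)
        status[name] = os.readlink(path) if os.path.islink(path) else None
    return status


if __name__ == "__main__":
    for name, target in usr_merge_status().items():
        print("/%s -> %s" % (name, target or "(not a symlink)"))
```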
.help
.databases
.tables
.schema <tableName> # show the table's SQL
.fullschema
.quit
.mode list # default: no header, pipe-separated
.mode tabs # tab-separated
.mode column # more readable
.mode line # like MySQL's \G
sqlite> .show
echo: off
eqp: off
explain: auto
headers: off
mode: list
nullvalue: ""
output: stdout
colseparator: "|"
rowseparator: "\n"
stats: off
width:
filename: /tmp/history.db
sqlite> .dbinfo main
database page size: 4096
write format: 2
read format: 2
reserved bytes: 0
file change counter: 3110
database page count: 6669
freelist page count: 0
schema cookie: 7
schema format: 4
default cache size: 0
autovacuum top root: 0
incremental vacuum: 0
text encoding: 1 (utf8)
user version: 2
application id: 0
software version: 3033000
number of tables: 4
number of indexes: 6
number of triggers: 0
number of views: 0
schema size: 785
data version 2
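Several of the values reported by `.dbinfo` can also be read from application code through PRAGMA statements. This is a rough sketch with Python's standard `sqlite3` module; the helper name `db_header_info` and the selection of pragmas are choices made for this example, and it does not reproduce the full `.dbinfo` output.

```python
import sqlite3

# PRAGMAs that correspond to some of the header fields shown by .dbinfo.
HEADER_PRAGMAS = (
    "page_size",
    "page_count",
    "freelist_count",
    "schema_version",
    "user_version",
    "application_id",
    "encoding",
)


def db_header_info(conn):
    """Return a dict of header-related PRAGMA values for an open connection."""
    return {
        name: conn.execute("PRAGMA " + name).fetchone()[0]
        for name in HEADER_PRAGMAS
    }


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    for key, value in db_header_info(connection).items():
        print("%-16s %s" % (key + ":", value))
    connection.close()
```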
.schema tablename
pragma dbname.table_info(tablename)
pragma dbname.table_xinfo(tablename) # also includes hidden columns of virtual tables
SELECT * FROM sqlite_master WHERE tbl_name = 'tablename';
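The same schema lookups work from code. Below is a short sketch with Python's standard `sqlite3` module that combines `PRAGMA table_info` with a query on `sqlite_master`; the helper name `describe_table` and the example table `visits` are hypothetical.

```python
import sqlite3


def describe_table(conn, table):
    """Return the column names/types and the CREATE statement of a table."""
    # PRAGMA arguments cannot be bound as parameters, so quote the identifier.
    quoted = '"' + table.replace('"', '""') + '"'
    columns = conn.execute("PRAGMA table_info(%s)" % quoted).fetchall()
    row = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table,),
    ).fetchone()
    return {
        # Each table_info row is (cid, name, type, notnull, dflt_value, pk).
        "columns": [(name, ctype) for _cid, name, ctype, _nn, _dflt, _pk in columns],
        "sql": row[0] if row else None,
    }


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE visits (id INTEGER PRIMARY KEY, url TEXT)")
    print(describe_table(connection, "visits"))
    connection.close()
```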
sqlite> .help
.archive ... Manage SQL archives
.auth ON|OFF Show authorizer callbacks
.backup ?DB? FILE Backup DB (default "main") to FILE
.bail on|off Stop after hitting an error. Default OFF
.binary on|off Turn binary output on or off. Default OFF
.cd DIRECTORY Change the working directory to DIRECTORY
.changes on|off Show number of rows changed by SQL
.check GLOB Fail if output since .testcase does not match
.clone NEWDB Clone data into NEWDB from the existing database
.connection [close] [#] Open or close an auxiliary database connection
.databases List names and files of attached databases
.dbconfig ?op? ?val? List or change sqlite3_db_config() options
.dbinfo ?DB? Show status information about the database
.dump ?OBJECTS? Render database content as SQL
.echo on|off Turn command echo on or off
.eqp on|off|full|... Enable or disable automatic EXPLAIN QUERY PLAN
.excel Display the output of next command in spreadsheet
.exit ?CODE? Exit this program with return-code CODE
.expert EXPERIMENTAL. Suggest indexes for queries
.explain ?on|off|auto? Change the EXPLAIN formatting mode. Default: auto
.filectrl CMD ... Run various sqlite3_file_control() operations
.fullschema ?--indent? Show schema and the content of sqlite_stat tables
.headers on|off Turn display of headers on or off
.help ?-all? ?PATTERN? Show help text for PATTERN
.import FILE TABLE Import data from FILE into TABLE
.imposter INDEX TABLE Create imposter table TABLE on index INDEX
.indexes ?TABLE? Show names of indexes
.limit ?LIMIT? ?VAL? Display or change the value of an SQLITE_LIMIT
.lint OPTIONS Report potential schema issues.
.load FILE ?ENTRY? Load an extension library
.log FILE|off Turn logging on or off. FILE can be stderr/stdout
.mode MODE ?TABLE? Set output mode
.nonce STRING Disable safe mode for one command if the nonce matches
.nullvalue STRING Use STRING in place of NULL values
.once ?OPTIONS? ?FILE? Output for the next SQL command only to FILE
.open ?OPTIONS? ?FILE? Close existing database and reopen FILE
.output ?FILE? Send output to FILE or stdout if FILE is omitted
.parameter CMD ... Manage SQL parameter bindings
.print STRING... Print literal STRING
.progress N Invoke progress handler after every N opcodes
.prompt MAIN CONTINUE Replace the standard prompts
.quit Exit this program
.read FILE Read input from FILE
.recover Recover as much data as possible from corrupt db.
.restore ?DB? FILE Restore content of DB (default "main") from FILE
.save FILE Write in-memory database into FILE
.scanstats on|off Turn sqlite3_stmt_scanstatus() metrics on or off
.schema ?PATTERN? Show the CREATE statements matching PATTERN
.selftest ?OPTIONS? Run tests defined in the SELFTEST table
.separator COL ?ROW? Change the column and row separators
.session ?NAME? CMD ... Create or control sessions
.sha3sum ... Compute a SHA3 hash of database content
.shell CMD ARGS... Run CMD ARGS... in a system shell
.show Show the current values for various settings
.stats ?ARG? Show stats or turn stats on or off
.system CMD ARGS... Run CMD ARGS... in a system shell
.tables ?TABLE? List names of tables matching LIKE pattern TABLE
.testcase NAME Begin redirecting output to 'testcase-out.txt'
.testctrl CMD ... Run various sqlite3_test_control() operations
.timeout MS Try opening locked tables for MS milliseconds
.timer on|off Turn SQL timer on or off
.trace ?OPTIONS? Output each SQL statement as it is run
.vfsinfo ?AUX? Information about the top-level VFS
.vfslist List all available VFSes
.vfsname ?AUX? Print the name of the VFS stack
.width NUM1 NUM2 ... Set minimum column widths for columnar output