Archive for the ‘Life & Work’ Category

A Python tool for saving a screenshot of a website as a PNG file

Thursday, March 11th, 2010

I just found a great new tool, webkit2png. Just download the file, pull up a terminal window, and type something like:

python webkit2png http://blog.eood.cn
This gives you a full-page screenshot of my blog.

WordPress Theme: Hot lips

Saturday, March 6th, 2010

I am releasing my WordPress theme:

Hot lips is the theme you see on this blog. You can download it and install it on your own blog for free.

Theme Name: Hot lips

Theme URI: http://blog.eood.cn/

Description: Hot lips WordPress theme.

Version: 1.0

Author: Bruce Dou

Tags: orange, fixed-width, two-columns

Download

Optimizing a Java program, plus some related Linux notes

Friday, March 5th, 2010

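1. Use JProfiler to find the program's performance bottlenecks.
2. Move long-running steps from a single thread to multiple threads or a thread pool. If the problem is I/O (I/O is commonly the system bottleneck), handle it with NIO, i.e. the Reactor pattern.
3. Compress large numbers of small files and keep them in memory, writing them to disk in fixed-size batches. Keep large numbers of intermediate variables in memory, or serialize and compress them before storing them in memory, then decompress and deserialize them when they are needed.
4. Use a Queue as a buffer between the different stages.
5. Linux usually limits the number of open files. To lift the default limit of 1024: ulimit -n 8192
6. The JVM usually has a memory limit. To raise it, add the runtime options -Xms20m -Xmx200m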
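
Tips 2 and 4 above can be sketched together as follows. This is only a minimal illustration, not the code from the project: the class PipelineSketch, the method slowStep, and the queue size of 100 are hypothetical names and values chosen for the example. A fixed-size thread pool runs the slow step, and a bounded BlockingQueue buffers the results for the next stage.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class PipelineSketch {
    // Hypothetical stand-in for a slow step such as fetching or parsing a page.
    static int slowStep(String item) {
        return item.length();
    }

    // Runs slowStep on a thread pool and hands the results to a consumer
    // stage through a bounded queue. Returns the sum of all results.
    static int process(List<String> items, int threads) throws InterruptedException {
        BlockingQueue<Integer> results = new LinkedBlockingQueue<>(100);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (String item : items) {
            pool.execute(() -> {
                try {
                    // put() blocks when the queue is full, so a slow consumer
                    // throttles the producers instead of exhausting memory.
                    results.put(slowStep(item));
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        pool.shutdown(); // accept no new tasks; already submitted ones still run

        // Consumer stage: take exactly one result per submitted item.
        int total = 0;
        for (int i = 0; i < items.size(); i++) {
            total += results.take();
        }
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> items = Arrays.asList("first", "second", "third");
        System.out.println("total length: " + process(items, 4));
    }
}
```

The bounded queue is the design choice that matters here: with an unbounded one, a fast producer stage can fill the heap long before the consumer catches up.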

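Also:
1. Run the main method of a class inside a jar: java -cp test.jar com.acosys.clawer.GetContent
2. Run a Java program in the background on Linux: nohup … &
3. List the nohup programs running in the background (only those started from the current shell): jobs
4. FTP upload and download tools that can run in the background: ncftpget ncftpput
5. Find a background program in the Linux process list: ps aux | grep …
6. Force a Linux program to terminate: kill -9 …
7. Count the files in a folder: ls -l | grep "^-" | wc -l
8. Run a Java program that depends on several libraries: java -cp nutch-1.0.jar:commons-logging-1.0.4.jar:hadoop-0.19.1-core.jar:xerces-2_6_2.jar org.apache.nutch.tools.DmozParser content.rdf.u8 > domz/urls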
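
Going back to tip 3 in the first list (serialize and compress intermediate data in memory, then decompress and deserialize it on use), here is a minimal sketch using only the standard library. The class CompressedCache and its methods pack and unpack are hypothetical names for this example, and it assumes the data implements Serializable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class CompressedCache {
    // Serializes the value and gzips it into a byte array held in memory.
    static byte[] pack(Serializable value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new GZIPOutputStream(bytes))) {
            out.writeObject(value);
        } // closing the stream finishes the gzip trailer
        return bytes.toByteArray();
    }

    // Reverses pack(): gunzips the bytes and deserializes the object.
    static Object unpack(byte[] packed) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(
                 new GZIPInputStream(new ByteArrayInputStream(packed)))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        // Repetitive sample data, which compresses well.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            sb.append("intermediate result ");
        }
        String original = sb.toString();

        byte[] packed = pack(original);
        System.out.println(original.length() + " chars packed into "
                + packed.length + " bytes");
        System.out.println("round trip ok: " + original.equals(unpack(packed)));
    }
}
```

How much memory this saves depends entirely on the data; it trades CPU time for space, so it is worth measuring both before and after.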

Google's indexing is now close to real time

Monday, March 1st, 2010

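Take a look at the time at which the most recently added blog post was indexed.

At the moment, a post published on this blog can be found in Google within 1 minute. Perhaps Google has made its indexing of blogs closer to real time.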

Performance of a crawler implemented with Java NIO

Monday, March 1st, 2010

Over the past two days I implemented a crawler with NIO and tested it on my local machine.
Test result:
--------------------------
Transfer: 3.66 MB
Complete content: 64
Peak connection num: 50
Used 24807 ms
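
The crawler's code is not included in this post. As a rough illustration of the selector-driven (Reactor) style that NIO makes possible, here is a minimal sketch in which one thread waits on a Selector and reads whenever a channel becomes readable. To stay self-contained it uses an in-process java.nio.channels.Pipe instead of network sockets; the class ReactorSketch and its methods readAll and demo are hypothetical names, and a real crawler would register many SocketChannels with the same selector.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

class ReactorSketch {
    // Reads everything written to the pipe from a single selector loop.
    static String readAll(Pipe pipe) throws IOException {
        StringBuilder received = new StringBuilder();
        try (Selector selector = Selector.open()) {
            Pipe.SourceChannel source = pipe.source();
            source.configureBlocking(false); // required before registering
            source.register(selector, SelectionKey.OP_READ);

            ByteBuffer buffer = ByteBuffer.allocate(1024);
            boolean open = true;
            while (open) {
                selector.select(); // blocks until some channel is ready
                Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                while (keys.hasNext()) {
                    SelectionKey key = keys.next();
                    keys.remove(); // selected keys must be cleared by hand
                    if (!key.isReadable()) {
                        continue;
                    }
                    buffer.clear();
                    int n = source.read(buffer);
                    if (n < 0) { // end of stream: the writer closed its end
                        key.cancel();
                        source.close();
                        open = false;
                    } else {
                        buffer.flip();
                        received.append(StandardCharsets.US_ASCII.decode(buffer));
                    }
                }
            }
        }
        return received.toString();
    }

    // Writes two chunks from another thread and collects them via readAll.
    static String demo() throws IOException, InterruptedException {
        Pipe pipe = Pipe.open();
        Thread writer = new Thread(() -> {
            try (Pipe.SinkChannel sink = pipe.sink()) {
                sink.write(ByteBuffer.wrap("hello ".getBytes(StandardCharsets.US_ASCII)));
                sink.write(ByteBuffer.wrap("reactor".getBytes(StandardCharsets.US_ASCII)));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
        writer.start();
        String result = readAll(pipe);
        writer.join();
        return result;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo());
    }
}
```

With sockets, the same loop would also handle OP_CONNECT and OP_WRITE events, which is how one thread can keep dozens of connections open at once.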