Identifying content blocks from Web documents