Information extraction for scholarly digital libraries