ライブラリ自体はブラウザエミュレータ(javascriptサポートなし)であり、ブラウザの使用を必要とする(「無効な」javascriptのコーディングを使用して)クラスの問題を解決できます。 特に、承認とPDFドキュメントの保存を必要とする1つのリポジトリから膨大な数の科学記事をダウンロードする必要があるときに最初にこのライブラリにアクセスしたため、補助ツールの助けを借りずに1時間だけドキュメントをダウンロードする必要がありましたが、 WWW :: Mechanize PERLライブラリ(その機能については少し前に読みました)を覚えていませんでした。GoogleでWWW :: Mechanize pythonを入力しなかったので、ソーシングが必要になりました。
しかし、きれいな歌詞。
それはすべて、ライブラリのインストールから始まります。 ダウンロードセクションのオフサイトで詳細を読むのがベストです。 要するに、このライブラリを取得する主な方法は3つあります。
個人的には、この記事で詳しく説明するのは意味がありませんが、オフサイトでこれを行うよりも詳細に記述するのは意味がありません。
このライブラリのいくつかの機能を示すために、ハブに移動してログインしようとする小さなスクリプトを作成しました。これを実行できた場合は、最大数のコメントでメインのhubratopikを検索し、すべてを表示します。
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
# -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url
ソースコード
スクリプトを実行し、次のようなものを取得します。
$ python habratest.py Habrahabrへようこそ ログインしようとしています krigとして正常にログインしました メインの記事の検索を開始します habrakamentaなしのHabratopikov:0 habrakamentaの最大数:72 最もコメントのあるトピック:無料のコンピューター そして、次の場所にあります:http://habrahabr.ru/blogs/startup_ideas/52540/
この素晴らしいライブラリのその他の興味深い機能は、 オフサイトで見ることができます。