mechanizeを使用したWebアプリケーションのテスト

Ruby Webアプリケーションの自動テストツールであるWatirに関するhabratopikをきっかけに、Python言語の同様のツールに関する短い記事を書くことにしました。 それは素晴らしい機械化ライブラリーについてです。 Watirとは異なり、mechanizeはどのOS向けにも調整されておらず、Pythonライブラリurllibおよびurllib2のアドオンです。



ライブラリ自体はブラウザエミュレータ(javascriptサポートなし)であり、ブラウザの使用を必要とする(「無効な」javascriptのコーディングを使用して)クラスの問題を解決できます。 特に、承認とPDFドキュメントの保存を必要とする1つのリポジトリから膨大な数の科学記事をダウンロードする必要があるときに最初にこのライブラリにアクセスしたため、補助ツールの助けを借りずに1時間だけドキュメントをダウンロードする必要がありましたが、 WWW :: Mechanize PERLライブラリ(その機能については少し前に読みました)を覚えていませんでした。GoogleでWWW :: Mechanize pythonを入力しなかったので、ソーシングが必要になりました。



しかし、きれいな歌詞。



それはすべて、ライブラリのインストールから始まります。 ダウンロードセクションのオフサイトで詳細を読むのがベストです。 要するに、このライブラリを取得する主な方法は3つあります。





個人的には、この記事で詳しく説明するのは意味がありませんが、オフサイトでこれを行うよりも詳細に記述するのは意味がありません。



このライブラリのいくつかの機能を示すために、ハブに移動してログインしようとする小さなスクリプトを作成しました。これを実行できた場合は、最大数のコメントでメインのhubratopikを検索し、すべてを表示します。



  1. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  2. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  3. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  4. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  5. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  6. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  7. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  8. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  9. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  10. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  11. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  12. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  13. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  14. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  15. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  16. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  17. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  18. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  19. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  20. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  21. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  22. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  23. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  24. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  25. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  26. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  27. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  28. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  29. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  30. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  31. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  32. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  33. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  34. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  35. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  36. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  37. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  38. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  39. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  40. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  41. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  42. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  43. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  44. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  45. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  46. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  47. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  48. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  49. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  50. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  51. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  52. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  53. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  54. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url



  55. # -*- coding: utf-8 -*- import re from mechanize import Browser # , root_url = 'http://habrahabr.ru/' login_url = 'http://habrahabr.ru/login/' username = 'krig' userpass = '******' comments_re = r 'habrahabr\.ru/blog.*/#comments' count_re = re . compile ( '(\d+)\s*\+(\d+)' ) post_id_re = re . compile ( 'habrahabr\.ru/blog.*/(\d+)/#comments' ) def is_logged_in (page_text): return not login_url in page_text br = Browser() home_page = br. open (root_url) print ' ' , br.title() # , if not is_logged_in (home_page.read()): print " " br.follow_link(text= '' ) br.select_form(name= "login" ) br[ "login" ] = username br[ "password" ] = userpass result_page = br.submit() if not is_logged_in (result_page.read()): print " " exit( 1 ) else : print " " , username # max = zero_comments = 0 title = url = '' print " " br. open (root_url) # , # , , br.links(...) # for comments_url in [url for url in br.links(url_regex=comments_re)]: m = count_re.search(comments_url.text) # , if not m: zero_comments += 1 else : if int (m.group( 1 )) > max: max = int (m.group( 1 )) habrapost_id = post_id_re.search(comments_url.absolute_url).group( 1 ) post_url = [url for url in br.links(url_regex=r 'habrahabr\.ru/blog.*/' + habrapost_id + '/?$' )][0] title = post_url.text url = post_url.absolute_url print ' :' , zero_comments print ' :' , max if max: print ' :' , title print ' :' , url





ソースコード



スクリプトを実行し、次のようなものを取得します。

  $ python habratest.py 
 Habrahabrへようこそ
ログインしようとしています
 krigとして正常にログインしました
メインの記事の検索を開始します
 habrakamentaなしのHabratopikov:0
 habrakamentaの最大数:72
最もコメントのあるトピック:無料のコンピューター
そして、次の場所にあります:http://habrahabr.ru/blogs/startup_ideas/52540/ 




この素晴らしいライブラリのその他の興味深い機能は、 オフサイトで見ることができます。




All Articles