{"id":207,"date":"2018-04-26T15:51:36","date_gmt":"2018-04-26T15:51:36","guid":{"rendered":"http:\/\/www.sugata.in\/?p=207"},"modified":"2018-04-29T03:39:28","modified_gmt":"2018-04-29T03:39:28","slug":"python-basics","status":"publish","type":"post","link":"http:\/\/www.sugata.in\/index.php\/2018\/04\/26\/python-basics\/","title":{"rendered":"Python Basics"},"content":{"rendered":"<p>I recently learned and started using Python for some of my projects. Python is a high-level programming language with a number of pre-programmed packages for a variety of useful tasks. Tasks I&#8217;ve used Python for include scraping the web for data (excellent!), machine learning (meh &#8230; but that&#8217;s more my fault than Python&#8217;s), OCR (super meh), and algorithmic name classification, such as gender determination (again, excellent!).<\/p>\n<p>While I will not provide direct code to perform predictive analysis using Python, I will use this post to link to a variety of resources that I have used, along with how I use them.<\/p>\n<p>First, how to get started with Python. I use Jupyter Notebook, along with Anaconda. Both are installed when you download and install the latest version of Anaconda: search for &#8220;download jupyter notebook&#8221; and go to the <a href=\"http:\/\/jupyter.org\/install\">first link<\/a>. The actual download will be from the <a href=\"https:\/\/www.anaconda.com\/download\/\">Anaconda website<\/a>. As of posting, the latest version is Python 3.6. Click &#8220;Download,&#8221; run the file, and accept all the default options to install Python and Jupyter Notebook.<\/p>\n<p>Jupyter Notebook runs inside your browser. Open up Jupyter Notebook, create a folder for coding, and then create a new Notebook. Each Notebook has distinct cells for distinct blocks of code that can be run separately. Once you run the code in a cell, the output appears right below it. 
Here is an example:<\/p>\n<p><a href=\"http:\/\/www.sugata.in\/wp\/wp-content\/uploads\/2018\/04\/Capture2.jpg\"><img loading=\"lazy\" class=\"alignnone size-large wp-image-208\" src=\"http:\/\/www.sugata.in\/wp\/wp-content\/uploads\/2018\/04\/Capture2-1024x856.jpg\" alt=\"Capture2\" width=\"550\" height=\"460\" srcset=\"http:\/\/www.sugata.in\/wp\/wp-content\/uploads\/2018\/04\/Capture2-1024x856.jpg 1024w, http:\/\/www.sugata.in\/wp\/wp-content\/uploads\/2018\/04\/Capture2-300x251.jpg 300w, http:\/\/www.sugata.in\/wp\/wp-content\/uploads\/2018\/04\/Capture2.jpg 1153w\" sizes=\"(max-width: 550px) 100vw, 550px\" \/><\/a><\/p>\n<p>&nbsp;<\/p>\n<p>As you can see, running a cell produces its output right below it. One thing I wanted to point out is that variables and variable types are created dynamically. The code &#8220;a=1&#8221; creates the variable a, makes it an integer, and sets it to 1, all in one step. Printing (and other functions) can be applied to integers (e.g. &#8220;print(a)&#8221;) or strings (e.g. &#8220;print('hello world')&#8221;) but not to a mix (see the error in the second cell).<\/p>\n<p>The second thing (and I love this) is that indentation is part of the language.<\/p>\n<p>if 3&gt;2:<\/p>\n<p style=\"padding-left: 30px;\">print('hi')<br \/>\nprint('there')<\/p>\n<p>will print<\/p>\n<p>hi<br \/>\nthere<\/p>\n<p>if 3&lt;2:<\/p>\n<p style=\"padding-left: 30px;\">print('hi')<br \/>\nprint('there')<\/p>\n<p>will print nothing<\/p>\n<p>but<\/p>\n<p>if 3&lt;2:<\/p>\n<p style=\"padding-left: 30px;\">print('hi')<\/p>\n<p>print('there')<\/p>\n<p>will print<\/p>\n<p>there<\/p>\n<p>The indentation controls what is run in the &#8220;if&#8221; statement. 
This forces discipline in writing readable (and working) code.<\/p>\n<p>Once you&#8217;ve gotten Python up and running, you&#8217;ll need additional packages for other tasks.<\/p>\n<p>For web scraping, I&#8217;d recommend Selenium and ChromeDriver.<\/p>\n<p>For OCR, I&#8217;d recommend Tesseract (Google&#8217;s OCR).<\/p>\n<p>For machine learning, I use (but don&#8217;t know enough to recommend) TensorFlow.<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I recently learned and started using Python for some of my projects. Python is a high-level programming language with a number of pre-programmed packages for a variety of useful tasks. Tasks I&#8217;ve used Python for include scraping the web for data (excellent!), machine learning (meh &#8230; but that&#8217;s more my fault than Python&#8217;s), OCR [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[4,9],"tags":[],"_links":{"self":[{"href":"http:\/\/www.sugata.in\/index.php\/wp-json\/wp\/v2\/posts\/207"}],"collection":[{"href":"http:\/\/www.sugata.in\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/www.sugata.in\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/www.sugata.in\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/www.sugata.in\/index.php\/wp-json\/wp\/v2\/comments?post=207"}],"version-history":[{"count":4,"href":"http:\/\/www.sugata.in\/index.php\/wp-json\/wp\/v2\/posts\/207\/revisions"}],"predecessor-version":[{"id":212,"href":"http:\/\/www.sugata.in\/index.php\/wp-json\/wp\/v2\/posts\/207\/revisions\/212"}],"wp:attachment":[{"href":"http:\/\/www.sugata.in\/index.php\/wp-json\/wp\/v2\/media?parent=207"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/www.sugata.in\/index.php\/wp-json\/wp\/v2\/categories
?post=207"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/www.sugata.in\/index.php\/wp-json\/wp\/v2\/tags?post=207"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}