<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>web scraping Archives -</title>
	<atom:link href="https://mitindia.in/tag/web-scraping/feed/" rel="self" type="application/rss+xml" />
	<link>https://mitindia.in/tag/web-scraping/</link>
	<description></description>
	<lastBuildDate>Sun, 13 Dec 2020 08:00:04 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9.4</generator>

<image>
	<url>https://mitindia.in/wp-content/uploads/2023/03/cropped-android-chrome-512x512-1-32x32.png</url>
	<title>web scraping Archives -</title>
	<link>https://mitindia.in/tag/web-scraping/</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title>Data Science &#8211; 1 [web scraping]</title>
		<link>https://mitindia.in/data-science-1-web-scraping/</link>
		
		<dc:creator><![CDATA[SKB]]></dc:creator>
		<pubDate>Sun, 13 Dec 2020 07:58:09 +0000</pubDate>
				<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[data science]]></category>
		<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[web scraping]]></category>
		<guid isPermaLink="false">http://www.mitindia.in/?p=1076</guid>

					<description><![CDATA[<p>Data science is the field of study that combines domain expertise, programming skills, and knowledge of mathematics and statistics to extract meaningful insights from large volumes of data. Web scraping can speed up the data collection process. Instead of checking a job site by hand every day, you can use Python [&#8230;]</p>
<p>The post <a href="https://mitindia.in/data-science-1-web-scraping/">Data Science &#8211; 1 [web scraping]</a> appeared first on <a href="https://mitindia.in"></a>.</p>
]]></description>
										<content:encoded><![CDATA[<h1>Data science</h1>
<p><strong>Data science</strong> is the field of study that combines domain expertise, programming skills, and knowledge of mathematics and statistics to extract meaningful insights from large volumes of data.</p>
<p><strong>Web scraping</strong> can speed up the data collection process.<br />
Instead of checking a job site by hand every day, you can use Python to automate the repetitive parts of your job search.</p>
<p>1. Web scraping example using Python [extract the title, head, and first h1 element from a website]</p>
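<pre class="EnlighterJSRAW" data-enlighter-language="null">from bs4 import BeautifulSoup
import requests

url = "https://flipkart.com"
req = requests.get(url, timeout=10)

# "lxml" needs the lxml package installed; the built-in "html.parser" also works
soup = BeautifulSoup(req.text, "lxml")
print(soup.title)  # title element
print(soup.head)   # whole head element
print(soup.h1)     # first h1 element, or None if the page has none

# save the first h1 element to a text file
with open("test.txt", "w", encoding="utf-8") as file1:
    file1.write(str(soup.h1))</pre>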
<p>Output of the above program:</p>
<p><img decoding="async" class="wp-image-1077 alignleft" src="http://www.mitindia.in/wp-content/uploads/2020/12/webscarp-300x40.png" alt="" width="443" height="59" srcset="https://mitindia.in/wp-content/uploads/2020/12/webscarp-300x40.png 300w, https://mitindia.in/wp-content/uploads/2020/12/webscarp-768x103.png 768w, https://mitindia.in/wp-content/uploads/2020/12/webscarp-1024x137.png 1024w, https://mitindia.in/wp-content/uploads/2020/12/webscarp.png 1251w" sizes="(max-width: 443px) 100vw, 443px" /></p>
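<p>2. Web scraping example [check how a site responds to a request: a status code of 200 means the page was fetched successfully, while codes such as 403 or 429 often mean the site is blocking automated requests. A 200 alone does not mean scraping is permitted; check the site's robots.txt file and terms of use for that.]</p>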
<pre class="EnlighterJSRAW" data-enlighter-language="null">import requests

# fetch the page and print the HTTP status code of the response
r = requests.get("https://www.amazon.in", timeout=10)
print(r.status_code)
</pre>
<p>Output:</p>
<blockquote><p>200</p></blockquote>
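<p>A 200 status code only shows that the request succeeded. A site's robots.txt file is the usual place to see which paths it asks crawlers to avoid. The sketch below uses Python's standard urllib.robotparser module on a made-up robots.txt; the rules, the example.com URLs, and the helper name is_allowed are assumptions for illustration and do not come from any real site.</p>

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the example runs offline.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt, url, user_agent="*"):
    """Return True if the robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(is_allowed(SAMPLE_ROBOTS_TXT, "https://example.com/products"))
print(is_allowed(SAMPLE_ROBOTS_TXT, "https://example.com/private/data"))
```

<p>For a live site, RobotFileParser also provides set_url() and read() to download the file before calling can_fetch().</p>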
<p>3. Retrieve live Covid-19 data using web scraping and store it locally in .csv format</p>
<pre class="EnlighterJSRAW" data-enlighter-language="null">import bs4
import pandas as pd
import requests

url = 'https://www.worldometers.info/coronavirus/country/india/'
result = requests.get(url, timeout=10)
soup = bs4.BeautifulSoup(result.text, 'lxml')

# find every div with the maincounter-number class
cases = soup.find_all('div', class_='maincounter-number')

# collect the text of the span inside each counter div
data = []
for i in cases:
    span = i.find('span')
    data.append(span.string)

print('Cases', '\tDeaths', '\tRecovered')
print(data)

# put the values in a DataFrame
df = pd.DataFrame({"CoronaData": data})
# label the rows (assumes the page returned exactly three counters)
df.index = ['TotalCases', 'Deaths', 'Recovered']
# store as a CSV file
df.to_csv('Corona_Data.csv')
</pre>
<p>Output of the above code:</p>
<blockquote><p>[ Cases / Deaths / Recovered ]<br />
['9,857,380 ', '143,055', '9,357,464']</p></blockquote>
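<p>The scraped values are strings with thousands separators and, in one case, trailing whitespace, so they cannot be used for arithmetic directly. A minimal sketch of cleaning them, using the sample values from the output above; the helper name to_int is an assumption for illustration.</p>

```python
def to_int(text):
    """Convert a string such as '9,857,380 ' to the integer 9857380."""
    return int(text.strip().replace(",", ""))


# Values copied from the sample output; live numbers will differ.
data = ["9,857,380 ", "143,055", "9,357,464"]
cleaned = [to_int(value) for value in data]
print(cleaned)  # [9857380, 143055, 9357464]
```

<p>Storing the cleaned integers in the DataFrame instead of the raw strings would let pandas treat the column as numeric.</p>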
]]></content:encoded>
					
		
		
			</item>
	</channel>
</rss>
