Monday, March 1, 2021

Common job roles in data science

Below is a general description of a few main job roles in the data science field.

1. Data Scientist

Being a data scientist can be intellectually demanding and analytically fulfilling, and it can put you at the cutting edge of new technological developments. As big data becomes more crucial to how businesses make choices, data scientists are becoming more prevalent and in demand.

Data scientists decide what questions their team should be asking and then work out how to use data to respond to those questions. For forecasting and reasoning, they frequently create predictive models.

Daily duties for a data scientist could include the following:

  • Analyze databases for patterns and trends to gain new insights.
  • To predict outcomes, create algorithms and data models.
  • Utilize machine learning methods to raise the caliber of data or product offerings.
  • Inform senior staff and other teams of your recommendations.
  • Use data analysis software like Python, R, SAS, or SQL.
  • Keep up with advancements in the field of data science.

***According to Glassdoor, the average compensation for a data scientist in the United States is $122,499 as of April 2022.


 2. Data Analyst

To solve a problem or answer a question, a data analyst gathers, cleans, and analyzes data sets. Data analysts work in a variety of fields, including government, business, finance, law enforcement, and science.

The practice of extracting information from data to guide better business decisions is known as data analysis. Five iterative phases typically comprise the data analysis process:

  • Identify the data you want to examine
  • Collect the data
  • Clean the data in preparation for analysis
  • Analyze the data
  • Interpret the results of the analysis

Here’s what many data analysts actually do on a day-to-day basis:

  1. Gather data: Analysts frequently do their own data collection. This can entail completing surveys, monitoring website visitor demographics, or purchasing datasets from data collection experts.
  2. Clean data: Raw data may include outliers, duplicates, or errors. Cleaning the data means maintaining its quality, in a spreadsheet or through a programming language, so that your interpretations are not inaccurate or distorted.
  3. Model data: This requires planning and developing a database's structural elements. You may decide which types of data to collect and store, how to relate different data categories to one another, and how the data will actually look.
  4. Interpret data: Finding patterns or trends in the data will enable you to interpret it and use it to support your interpretation of the question at hand.
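The "clean data" step above can be sketched in a few lines of Python. This is only a minimal illustration: the `survey` records, the `clean_records` helper, and the "more than five times the median" outlier rule are all assumptions made up for this example, not a standard recipe.

```python
from statistics import median


def clean_records(records, key, max_ratio=5.0):
    """Drop duplicate rows, rows with a missing value, and extreme outliers.

    The outlier rule (more than `max_ratio` times the median) is an
    arbitrary choice for this sketch, not a standard threshold.
    """
    # 1. Remove exact duplicates, keeping the first occurrence.
    seen = set()
    unique = []
    for row in records:
        fingerprint = tuple(sorted(row.items()))
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(row)

    # 2. Remove rows where the value of interest is missing.
    complete = [row for row in unique if row.get(key) is not None]
    if not complete:
        return []

    # 3. Remove values that are far away from the typical (median) value.
    typical = median(row[key] for row in complete)
    return [row for row in complete if abs(row[key]) <= max_ratio * abs(typical)]


# Hypothetical survey responses with typical raw-data problems.
survey = [
    {"id": 1, "age": 34},
    {"id": 2, "age": 29},
    {"id": 2, "age": 29},    # duplicate row
    {"id": 3, "age": None},  # missing value
    {"id": 4, "age": 41},
    {"id": 5, "age": 430},   # likely a typing error
]

cleaned = clean_records(survey, "age")
print([row["id"] for row in cleaned])
```

In real projects this step is usually done with a spreadsheet or a data library, and the right outlier rule depends on the data at hand.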

***The average base pay for a data analyst in the United States as of December 2022 is $62,382, according to job listing site Glassdoor.


3. Data Engineer

Data engineering is the practice of developing large-scale data collection, storage, and analysis systems. It covers a wide range of topics and has uses in almost every business. Organizations can gather massive volumes of data, but they need the right personnel and the right technology to make sure the data is in a highly usable form by the time it reaches data scientists and analysts.

The following are some of the most typical duties of a data engineer:

  • Develop, construct, test, and maintain architectures
  • Align the architecture with the needs of the business
  • Gather data
  • Create data set procedures
  • Utilize tools and programming languages
  • Determine how to increase data quality, efficiency, and reliability
  • Research questions about your industry and business
  • Utilize vast data sets to solve business problems
  • Utilize high-end analytics software, machine learning, and statistical techniques
  • Gather information for predictive and prescriptive modelling
  • Utilize data to find hidden patterns
  • Find tasks that can be automated using data
  • Provide stakeholders with updates based on analytics


4. Machine Learning Engineer

Machine learning engineers are in high demand today. However, the job profile has some difficulties. Machine learning engineers are expected to perform A/B testing, design data pipelines, and implement popular machine learning algorithms such as classification and clustering, apart from having a deep understanding of powerful technologies like SQL and REST APIs.

A few important roles and responsibilities of a machine learning engineer include:

  • Design and development of machine learning systems
  • Exploring machine learning algorithms
  • Testing machine learning systems
  • Application/product development based on client requirements
  • Extending existing machine learning frameworks and libraries
  • Exploring and visualizing data for a better understanding
  • Training and retraining systems
  • Understanding the importance of statistics in machine learning

 

***According to Glassdoor, the average salary for a Machine Learning Engineer is $107,270 per year in the US.



 5. Data Architect

A data architect creates data management plans so that databases can be easily integrated, centralized and protected with the best security measures. They also ensure that data engineers have the best tools and systems to work with.

A few important roles and responsibilities of a data architect include:

  • Development and implementation of an overall data strategy aligned with the business/organization
  • Identification of data collection sources in accordance with the data strategy
  • Collaborate with cross-functional teams and stakeholders for smooth operation of database systems
  • End-to-end data architecture planning and management
  • Maintaining database systems/architecture with efficiency and security in mind
  • Regularly audit the performance of the data management system and make changes to improve the systems accordingly.

 

Data scientist roles and responsibilities

Data scientists work closely with business stakeholders to understand their goals and determine how data can be used to achieve those goals. They design data modelling processes, create algorithms and predictive models to extract the data the business needs, help analyze the data, and share insights with peers. While each project is different, the process for gathering and analyzing data generally follows the path below:

1. Ask the right questions to begin the discovery process.

2. Acquire data.

3. Process and clean the data.

4. Integrate and store data.

5. Perform initial data investigation and exploratory data analysis.

6. Choose one or more potential models and algorithms.

7. Apply data science techniques, such as machine learning, statistical modelling, and artificial intelligence.

8. Measure and improve results.

9. Present the final result to stakeholders.

10. Make adjustments based on feedback.

11. Repeat the process to solve a new problem.
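As a rough illustration of steps 3 to 8, the sketch below cleans a tiny data set, fits a straight line to it, and measures the prediction error on held-out points. Everything here is an assumption for the example: the `raw` numbers are invented, and `fit_line` and `mean_abs_error` are hypothetical helper names, with simple least squares standing in for whatever model a real project would choose.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept with ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mean_abs_error(model, xs, ys):
    """Average absolute difference between predictions and actual values."""
    slope, intercept = model
    return sum(abs(slope * x + intercept - y) for x, y in zip(xs, ys)) / len(xs)


# Step 2: acquire data (invented monthly figures, one value is missing).
raw = [(1, 12.0), (2, 14.1), (3, None), (4, 18.2),
       (5, 19.9), (6, 22.1), (7, 23.8), (8, 26.0)]

# Step 3: process and clean the data by dropping incomplete rows.
clean = [(x, y) for x, y in raw if y is not None]

# Steps 6-7: choose a model and fit it, holding out the last two points.
train, test = clean[:-2], clean[-2:]
model = fit_line(*zip(*train))

# Step 8: measure results on data the model has not seen.
error = mean_abs_error(model, *zip(*test))
print(f"slope={model[0]:.2f}, intercept={model[1]:.2f}, held-out error={error:.2f}")
```

Holding back part of the data for measurement is a common design choice: it gives a more honest picture of how the model behaves on new data than checking it against the points it was fitted on.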




 

 

 

 


Data scientist: Sexiest job in the 21st century

As we are living in the Big Data Era, Data Science is becoming a very promising field to harness and process huge volumes of data generated from various sources. Data Science is a vast discipline in itself, consisting of specialized skill sets such as statistics, mathematics, programming, computer science and so on. Data science consists of several elements, techniques and theories including math, statistics, predictive analysis, data modelling, data engineering, data mining, and visualization.

In this modern era, data scientists are the superpowered heroes who lead the digital world.

Who is actually a data scientist? What do they actually do? Are they struggling with data all day and night, or experimenting in their laboratory with complex mathematics?

Let’s explore!

There are several definitions of a data scientist. In simple words, a data scientist is a professional responsible for collecting, analyzing, and interpreting extremely large amounts of data.

They’re part mathematician, part computer scientist and part trend-spotter. And, because they straddle both the business and IT worlds, they’re highly sought-after and well-paid.


They’re also a sign of the times. Data scientists weren’t on many radars a decade ago, but their sudden popularity reflects how businesses now think about big data. That unwieldy mass of unstructured information can no longer be ignored and forgotten. It’s a virtual gold mine that helps boost revenue – as long as there’s someone who digs in and unearths business insights that no one thought to look for before.

To get a clearer idea, here are some definitions of a data scientist from popular websites.


  • Data scientists are big data wranglers, gathering and analyzing large sets of structured and unstructured data. A data scientist’s role combines computer science, statistics, and mathematics. They analyze, process, and model data then interpret the results to create actionable plans for companies and other organizations. (Masters in Data Science)

  • Data Scientist practices the art of Data Science. (Edureka)

  • A data scientist is a professional responsible for collecting, analyzing and interpreting extremely large amounts of data. The data scientist role is an offshoot of several traditional technical roles, including mathematician, scientist, statistician and computer professional. This job requires the use of advanced analytics technologies, including machine learning and predictive modelling. (TechTarget)


Inspiring Facts: Top 5 data scientists (from AnalyticsInsights)

 

1. Geoffrey Hinton

Geoffrey Hinton is called the Godfather of Deep Learning in the field of data science. Mr Hinton, who holds a PhD in artificial intelligence, is best known for his exemplary work on neural networks and artificial intelligence.

Twitter- @geoffreyhinton

 

Awards– A.M. Turing Award (2018), BBVA Foundation Frontiers of Knowledge Award in Information and Communication Technologies (2016), IEEE Frank Rosenblatt Award (2014), IJCAI Award for Research Excellence (2005), Rumelhart Prize (2001).

 

2. Jeff Hammerbacher

 

Credited with co-coining the term “data science”, Jeff Hammerbacher developed methods and techniques for capturing, storing, and analysing large amounts of data. Credited with starting Facebook’s data science team, he threw his weight behind adopting Hadoop, enabling the social media giant’s data team to process tons of data in real time at lightning-fast speed. Mr Hammerbacher is a co-founder of Cloudera and has also been an instructor at the Icahn School of Medicine.

Twitter- @hackingdata

Book- Beautiful Data

 

3. Dhanurjay Patil

Dhanurjay Patil is a former US Chief Data Scientist, and along with Jeff Hammerbacher he coined the term “data science”. Holding a doctorate in Applied Mathematics from the University of Maryland, College Park, the distinguished Dhanurjay Patil has been a principal consultant to many blue-chip companies, including LinkedIn, Skype, Salesforce, PayPal, eBay, and Greylock Partners.

Twitter- @dpatil

Awards– Medal for Distinguished Public Service

 

4. Alex “Sandy” Pentland

 

Alex “Sandy” Pentland was named one of the world’s seven most powerful data scientists, along with Larry Page, by Tim O’Reilly in 2011. Mr Pentland also founded and leads an MIT-wide program that pioneers computational social science using Big Data and AI. A serial entrepreneur, he co-leads the World Economic Forum Big Data and Personal Data initiatives and is a founding member of the Advisory Boards for Motorola Mobility, Telefonica, Nissan, and a variety of start-up firms.

Mr Pentland leads the Media Lab Entrepreneurship Program, promoting companies that use cutting-edge technologies to solve real-world problems. Mr Pentland is also an advisor to the Enigma Project & Endor.

Twitter- @alex_pentland

Awards– McKinsey Award from Harvard Business Review, Brandeis Award, The 40th Anniversary of the Internet (from DARPA)

 

5. Dean Abbott

Founder and president of Abbott Analytics, Dean Abbott is a seasoned data science professional. With over 21 years of enriching experience, he is adept at deploying advanced and complex data mining techniques into data preparation and data visualization.

Mr Abbott is credited with outstanding expertise in fraud detection mechanics, data and modelling, missile guidance, survey analysis, predictive toxicology, and signal processing.

Twitter- @deanabb

Books – IBM SPSS Modeler Cookbook and Applied Predictive Analytics: Principles and Techniques for the Professional Data Analyst.

 

 

Sunday, February 28, 2021

Inspiring Facts: 5 famous companies and brands that use data science to improve their performance


This is a little effort to show the power behind Data Science. Let’s take a closer look at some companies and brands using such platforms to improve performance and efficiency and deliver better customer experiences.

 

#01 Walmart



  • Walmart uses data mining to discover patterns in point of sales data. Data mining helps Walmart find patterns that can be used to provide product recommendations to users based on which products were bought together or which products were bought before the purchase of a particular product.
  • A familiar example of effective data mining through the association rule learning technique at Walmart is the finding that sales of Strawberry pop-tarts increased 7 times before a hurricane. After Walmart identified this association between hurricanes and Strawberry pop-tarts through data mining, it started placing all the Strawberry pop-tarts at the checkouts before a hurricane.
  • Another noted example is from Halloween, when sales analysts at Walmart could look at the data in real time and found that, though a specific cookie was popular across all Walmart stores, there were 2 stores where it was not selling at all. The situation was immediately investigated, and it was found that a simple stocking oversight had kept the cookies from being put on the shelves for sale. The issue was rectified immediately, which prevented further loss of sales.
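To make the idea of association rule learning more concrete, here is a minimal sketch of the three numbers usually computed for a rule such as "customers who buy A also buy B": support, confidence, and lift. The `baskets` below are invented for illustration (they are not Walmart data), and `rule_metrics` is a hypothetical helper, not Walmart's actual method.

```python
def rule_metrics(baskets, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    n = len(baskets)
    with_a = sum(1 for b in baskets if antecedent in b)
    with_c = sum(1 for b in baskets if consequent in b)
    with_both = sum(1 for b in baskets if antecedent in b and consequent in b)
    if with_a == 0 or with_c == 0:
        raise ValueError("both items must appear in at least one basket")

    support = with_both / n          # how often the pair occurs at all
    confidence = with_both / with_a  # how often B appears when A is bought
    lift = confidence / (with_c / n) # above 1 means A and B go together
    return support, confidence, lift


# Invented shopping baskets, one set of items per checkout.
baskets = [
    {"flashlight", "pop-tarts", "water"},
    {"flashlight", "pop-tarts"},
    {"flashlight", "batteries", "pop-tarts"},
    {"milk", "bread"},
    {"bread", "pop-tarts"},
    {"milk", "eggs"},
    {"flashlight", "water"},
    {"eggs", "bread"},
]

support, confidence, lift = rule_metrics(baskets, "flashlight", "pop-tarts")
print(support, confidence, lift)  # 0.375 0.75 1.5
```

A lift above 1 suggests the two items are bought together more often than chance would predict, which is the kind of signal that can justify placing them side by side.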

 #02 McDonald's



McDonald’s is another famous company that uses data science to improve its performance. Its updated mobile app allows customers to order and pay almost entirely via their mobile devices. To make the experience that much more enjoyable, they gain access to exclusive deals, too. In return for the convenience, McDonald’s collects essential information about its audience. It can see what foods and services customers order, how often they visit, and whether they use the drive-thru or go inside. All this data allows for more targeted promotions and offers. In fact, Japanese customers using the company’s mobile app spend an average of 35 percent more because of spot-on recommendations just before they are ready to order food.


#03 Spotify



Spotify is another brand that uses big data for a better user experience. It uses AI and big data to deliver better playlists and streaming content recommendations to its users. The Discover Weekly feature is an excellent example of this in action. Each week, Spotify offers every user a personalized playlist with music recommendations based on their listening and browsing history. It’s kind of like a curated mixtape from the platform, offering new tracks and artists, showing you new genres you might enjoy or even updating you on your favorite music.

This feature is possible thanks to a vast trove of information and data they collect from their user base. When you have millions of people listening to music every day, you gain some pretty deep insights into user habits and preferences.

The company has also launched a “Spotify for Artists” app that lets bands and music artists see analytics related to their content.

 

#04 Amazon



The online retail giant has access to a massive amount of data on its customers: names, addresses, payments and search histories are all filed away in its data bank.

While this information is obviously put to use in advertising algorithms, Amazon also uses the information to improve customer relations, an area that many big data users overlook.

The next time you contact the Amazon help desk with a query, don't be surprised when the employee on the other end already has most of the pertinent information about you on hand. This allows for a faster, more efficient customer service experience that doesn't include having to spell out your name three times.


#05 Coca-Cola



The company collects data on its customers to boost current consumption and upsell new products, which has led to a more efficient operation that cuts costs and boosts profits. As consumers share their opinions of the product through social media, phone or email, it allows the company to adjust its approach and better align with consumer interests and demands. The data the company collects is aimed at improving the brand experience and developing greater customer loyalty.


Saturday, February 27, 2021

Why Data Science? And why is it so important?



Now we have a clear idea of what data science is and of its history. We will now explore why we need something like data science and why it is important to the world.

Before that, we will look at why data matters so much in the current world.

Data is the electricity of today's world, the fuel that runs it. As mentioned in the first article, we are living in the age of the fourth industrial revolution, the era of artificial intelligence and big data. A massive data explosion has resulted in new technologies and smarter products. Around 2.5 exabytes of data are created each day, and the need for data has risen tremendously in the last decade.

Just think of the amount of data produced every millisecond throughout the world! Then imagine the world without data science: all that data would be just another raw material, rubbish that is gathered and disposed of without any use.

Before data scientists, there were statisticians who worked with data. These statisticians were experienced in qualitative analysis of data, and companies employed them to analyze their overall performance and sales. With the advent of computing, cloud storage, and analytical tools, the field of computer science merged with statistics.

 This gave birth to Data Science!

Data is magic, and data scientists are the wizards who know how to use it in an insightful way. A data scientist knows how to dig meaningful information out of whatever data they come across, and helps steer the company in the right direction.

In summary, data science, or data-driven science, enables better decision making, predictive analysis, and pattern discovery. It lets you:

·         Find the leading cause of a problem by asking the right questions.

·         Perform exploratory study on the data.

·         Model the data using various algorithms. 

·         Communicate and visualize the results via graphs, dashboards, etc.

In practice, data science is already helping the airline industry predict disruptions in travel to alleviate the pain for both airlines and passengers. With the help of data science, airlines can optimize operations in many ways, including:

·         Plan routes and decide whether to schedule direct or connecting flights.

·         Build predictive analytics models to forecast flight delays.

·         Offer personalized promotional offers based on customers booking patterns. 

·         Decide which class of planes to purchase for better overall performance.


Friday, February 26, 2021

Trip into the history of data science.

Data science has revolutionized several different aspects of our world. Let's take a look at when and where data science came from.

·         In 1962, John W. Tukey wrote “The Future of Data Analysis”. This work by the American mathematician is widely recognized as the first milestone in the history of data science. Tukey's influence on statistics is enormous, but the most famous coinage attributed to him relates to computer science: he was the first to introduce the term "bit" as a contraction of "binary digit."

·         In 1974, Peter Naur published the Concise Survey of Computer Methods, which surveyed data processing methods across a wide variety of applications. The term “data science” became clearer, as he gave it his own definition: “The science of dealing with data, once they have been established, while the relation of the data to what they represent is delegated to other fields and sciences.”

·         In 1977, the International Association for Statistical Computing (IASC) was founded.

·         In 1989, Gregory Piatetsky-Shapiro organized and chaired the first Knowledge Discovery in Databases (KDD) workshop.

·         In 1994, BusinessWeek published a cover story on “Database Marketing.”

·         In 1996, at the conference of the International Federation of Classification Societies (IFCS), the term “data science” was included in the title of the conference for the first time (“Data science, classification, and related methods”). In the same year, Usama Fayyad, Gregory Piatetsky-Shapiro, and Padhraic Smyth published “From Data Mining to Knowledge Discovery in Databases.”

·         In 1997, during his inaugural lecture as the H. C. Carver Chair in Statistics at the University of Michigan, Jeff Wu called for statistics to be renamed “data science” and statisticians to be renamed “data scientists.”

 

 

Data Science: The most booming field in the 21st century.


Our digital world creates a massive amount of data each and every second. The communication devices, sensors and computations we use capture information of great value to businesses and governments across the globe. There is an explosion of data around us, and Data Science and Big Data are transforming our world and touching our daily lives like never before. It is believed that over 2.5 quintillion bytes (2.5e+9 GB) of data are created every day, and the number keeps increasing.
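The unit conversion behind that figure is easy to check. The short sketch below assumes decimal units (1 GB = 10^9 bytes, 1 EB = 10^18 bytes) and takes the 2.5 quintillion bytes per day estimate at face value; the variable names are just for this example.

```python
BYTES_PER_DAY = 2_500_000_000_000_000_000  # 2.5 quintillion bytes (the quoted estimate)

GIGABYTE = 10**9   # decimal gigabyte, an assumption of this sketch
EXABYTE = 10**18   # decimal exabyte
SECONDS_PER_DAY = 24 * 60 * 60

gigabytes_per_day = BYTES_PER_DAY // GIGABYTE
exabytes_per_day = BYTES_PER_DAY / EXABYTE
terabytes_per_second = BYTES_PER_DAY / SECONDS_PER_DAY / 10**12

print(gigabytes_per_day)               # 2500000000, i.e. 2.5e+9 GB
print(exabytes_per_day)                # 2.5
print(round(terabytes_per_second, 1))  # about 28.9 TB every second
```

So 2.5 quintillion bytes, 2.5e+9 GB, and 2.5 exabytes are three ways of writing the same daily volume.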

Companies like Walmart and Adidas, entertainment services like Spotify and Netflix, search engine companies such as Google and Yahoo!, and leagues like the NBA (the National Basketball Association, a professional basketball league in North America) have created entirely new business models by capturing the information freely available on the web and providing it to people in useful ways. They collect trillions of bytes of data every day and continually add new services to improve their businesses.

Data science enables businesses to process huge amounts of structured and unstructured big data to detect patterns. This in turn allows companies to increase efficiencies, manage costs, identify new market opportunities, and boost their market advantage.

Asking a personal assistant like Alexa or Siri for a recommendation demands data science. So does operating a self-driving car, using a search engine that provides useful results, or talking to a chatbot for customer service. These are all real-life applications of data science.



  • Here are some definitions of data science from well-known websites.

1. Definition from IBM

Data science is a multidisciplinary approach to extracting actionable insights from the large and ever-increasing volumes of data collected and created by today’s organizations. Data science encompasses preparing data for analysis and processing, performing advanced data analysis and presenting the results to reveal patterns and enable stakeholders to draw informed conclusions.

 2. Definition from Edureka

Data Science is a blend of various tools, algorithms, and machine learning principles to discover hidden patterns from the raw data.

3. Definition from Wikipedia

Data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge and insights from many structural and unstructured data. 

4. Definition from DataRobot

Data science is the field of study that combines domain expertise, programming skills, and knowledge of mathematics and statistics to extract meaningful insights from data. Data science practitioners apply machine learning algorithms to numbers, text, images, video, audio, and more to produce artificial intelligence (AI) systems to perform tasks that ordinarily require human intelligence. In turn, these systems generate insights that analysts and business users can translate into tangible business value.

 5. Definition from ComputingForAll

Data science is a field of study that focuses on techniques and algorithms to extract knowledge from data. The area combines data mining and machine learning with data-specific domains.

6. Definition from OmniSci

Data science enables businesses to process huge amounts of structured and unstructured big data to detect patterns. This, in turn, allows companies to increase efficiencies, manage costs, identify new market opportunities, and boost their market advantage.

  • This is a simple video that will clearly describe data science.




Rise of Big Data

When talking about Big Data, as its name suggests, it is about a huge amount of data, which cannot be managed (stored, processed…) by none of...