
Data Connections Blog #5:  Mommy, where does data come from?

[i] Even Google needs more data:

[ii] Per Wikipedia: “Synthetic data is information that is artificially generated rather than produced by real-world events. Typically created using algorithms, synthetic data can be deployed to validate mathematical models and to train machine learning models.”
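To make "artificially generated" concrete, here is a minimal Python sketch of synthetic data produced by an algorithm instead of by real-world measurement. The function name `make_synthetic_heights`, the "heights" scenario, and the mean and spread values are all assumptions made up for illustration; they do not come from the blog or from Wikipedia.

```python
import random
import statistics


def make_synthetic_heights(n, mean_cm=170.0, stdev_cm=10.0, seed=42):
    """Return n made-up height values drawn from a normal distribution.

    Nothing here is measured from real people: every value is produced
    by a random number generator. The fixed seed makes the run repeatable.
    """
    rng = random.Random(seed)
    return [rng.gauss(mean_cm, stdev_cm) for _ in range(n)]


heights = make_synthetic_heights(1000)

# The sample average should land close to the 170 cm we asked for,
# which is the kind of check used when validating a model.
print("rows generated:", len(heights))
print("average height:", round(statistics.mean(heights), 1))
```

Because the generator is seeded, running the sketch twice gives the same dataset, which is handy when synthetic data is used to test a model.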

[iii] Data scraping (also called web scraping) is the process of extracting data from web pages, often by using bots to download it. Scraping public data is frequently seen as benign, but scraping done without a website’s permission can be malicious. Once collected, the data is put into a format more useful to an end user.
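One common way a website signals what bots may download is its robots.txt file. The sketch below uses Python's standard `urllib.robotparser` module to check a URL against a set of rules before scraping it. The rules in `ROBOTS_TXT`, the bot name `demo-bot`, the `example.com` URLs, and the helper `may_fetch` are hypothetical, chosen only for illustration; a real scraper would read the site's actual robots.txt, and robots.txt alone does not settle whether scraping is permitted.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules, as a site might publish them at /robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def may_fetch(robots_txt, user_agent, url):
    """Return True if the given robots.txt text allows user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


blog_ok = may_fetch(ROBOTS_TXT, "demo-bot", "https://example.com/blog/post.html")
private_ok = may_fetch(ROBOTS_TXT, "demo-bot", "https://example.com/private/report.html")

print("blog page allowed:", blog_ok)
print("private page allowed:", private_ok)
```

Parsing the rules from a string keeps the sketch runnable without network access; `RobotFileParser` can also load a live file with `set_url()` and `read()`.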

[iv] Metadata is like an index or card catalog for the data.  It is the data that describes the dataset.  

[v] More on robots.txt here: 

