IASS Webinar 8: Building a Sample Frame of SMEs Using Patent, Search Engine, and Website Data

Date/Time
Date(s) - 02/09/2021
1:00 pm - 2:30 pm

Please register for the IASS Webinar at:

https://attendee.gotowebinar.com/register/5869182490404848139

After registering, you will receive a confirmation email containing information about joining the webinar. There will be time for questions. The webinar will be recorded and made available on the ISI website. See below for the abstract and the speakers' biographies.

Webinar Abstract

This research outlines the process of building a sample frame of US small and medium-sized enterprises (SMEs). The method starts with a list of patenting organizations and defines the boundaries of the population and subsequent frame using free to low-cost data sources, including search engines and websites. Generating high-quality data is of key importance throughout the process of building the frame and subsequent data collection; at the same time, there is too much data to curate by hand. Consequently, we turn to machine learning and other computational methods to apply a number of data matching, filtering, and cleaning routines. The results show that it is possible to generate a sample frame of innovative SMEs with reasonable accuracy for use in subsequent research: our method provides data for 79% of the frame. We discuss implications for future work for researchers and national statistical institutes (NSIs) alike and contend that the challenges associated with big data collections require not only new skillsets but also a new mode of collaboration.

Biographies of Speakers

Sanjay K. Arora (sanjay.k.arora@ey.com): Sanjay is an innovation policy and management researcher who uses emerging big data sources to measure small firm R&D and entrepreneurial activity. He is also an ML Engineering Business Leader at EY, the global audit and consulting firm. Sanjay currently resides in Washington, DC.

Sarah Kelley (SKelley@childtrends.org): Sarah Kelley, MIDS, is a Senior Data Scientist at Child Trends where her work focuses on the intersection of social science and data science, especially using natural language processing and machine learning to answer social science questions.