Insights from 400k Kickstarter & Indiegogo Campaigns – Methodology

Methodology Overview

HiveWire has collaborated with Shopify to produce a crowdfunding infographic that showcases some of the insights we have generated in order to educate and inform those interested in crowdfunding.

The analysis behind the infographic was rooted predominantly in direct primary research using publicly available information. Our analytics methodology comprises four basic areas: data acquisition, cleaning & validation, measurement & analysis, and insight generation. We use a variety of tools, custom scripts, specialized software, and processes for our analytical projects. An overview of the methodology used for this infographic follows.

Data Acquisition

We wrote scripts to acquire publicly available crowdfunding campaign information. The first step was to acquire unique campaign URLs on Kickstarter and Indiegogo (two of the most notable rewards-based crowdfunding platforms) using crawling scripts. We found over 430,000 unique campaign URLs on these platforms in total. The second step was to extract specific variables from each campaign page using our scraping scripts (e.g. campaign name, campaign goal amount, and number of backers).
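As an illustration of the "unique campaign URLs" step, here is a minimal sketch of how crawled URLs might be reduced to a canonical form before de-duplication. This is not our actual crawler: the function names and the normalization rules (dropping query strings, fragments, and trailing slashes) are assumptions chosen for the example.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_campaign_url(url: str) -> str:
    """Reduce a campaign URL to a canonical form so duplicates compare equal.

    Assumed rules (hypothetical): lower-case scheme and host, drop the query
    string and fragment, and trim any trailing slash from the path.
    """
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/")
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, "", ""))


def unique_campaign_urls(urls):
    """Return normalized URLs in first-seen order, without duplicates."""
    seen = set()
    unique = []
    for url in urls:
        key = normalize_campaign_url(url)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique
```

Normalizing before comparing matters because the same campaign is often linked with different tracking parameters, which would otherwise inflate the count of "unique" URLs.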

Data Cleaning

We initiated a data exploration step to find any irregularities, missing pieces, or ‘suspicious’ looking campaigns. Our goal was to minimize the number of non-legitimate campaigns (e.g. test campaigns) in the dataset. For example, we rejected campaigns that did not have both a start and close date, did not have a campaign name, had invalid URLs, or had conflicting information (e.g. an amount pledged but a zero backer count). Campaigns that raised zero dollars were kept in the analysis: it is not uncommon to see campaigns that don’t raise any money, as a lack of crowdfunding education sometimes creates false expectations. We also cleaned legitimate campaign data as necessary, for example by removing extra or unwanted characters in variables that were not needed or that would impede efficient post-processing or analysis.
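The rejection rules listed above can be sketched as a simple filter. This is a minimal illustration, not our production code: the Campaign record, its field names, and the URL check are assumptions made for the example, and the real rule set was broader than the four checks shown.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Campaign:
    # Hypothetical record layout; field names are illustrative only.
    name: str
    url: str
    start: Optional[date]
    close: Optional[date]
    pledged: float
    backers: int


def is_legitimate(c: Campaign) -> bool:
    """Apply the example rejection rules described in the text."""
    # Must have both a start and a close date.
    if c.start is None or c.close is None:
        return False
    # Must have a campaign name.
    if not c.name.strip():
        return False
    # Simplified stand-in for URL validation (assumption).
    if not c.url.startswith(("http://", "https://")):
        return False
    # Conflicting information: money pledged but no backers.
    if c.pledged > 0 and c.backers == 0:
        return False
    # Campaigns that raised zero dollars are deliberately kept.
    return True
```

Note that a campaign with zero pledged and zero backers passes the filter, which matches the decision to keep campaigns that raised no money.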

Data Post-processing

Our post-processing steps involved the calculation or processing of a variety of variables. Examples include:

  • Average backing: For each campaign we calculated the average backing (amount pledged divided by number of backers).
  • Gender identification: Identifying the gender of a campaigner by comparing their first name to a list of names and genders.
  • Currency harmonization: All dollar figures are expressed in USD and foreign currency was converted at the appropriate exchange rate (exchange rate as of ~ August 18, 2014).
  • City aggregation: Aggregating campaign totals for specific cities required combining city boroughs and city areas, as well as various spellings of city names. Campaigns whose locations identified multiple cities were not included in the city totals presented (adding these campaigns would not significantly change the data presented).
  • Length of campaign: The length of the campaign (in days) was determined from the start and close dates.
  • Average number of words on campaign page: The number of words was calculated using only the full text of the campaign's about description.
  • Super categories: Although the majority of campaign categories are the same between Indiegogo and Kickstarter, it was necessary, for simplicity, to create super categories that group various categories across the two platforms. The categories that are essentially the same between the platforms are: Dance, Theatre, Music, Comics, Art, Design, Games, Food, Photography, Technology, and Fashion. The following super categories were combined from various other categories:
    • Film & Video super category contains:
      • Kickstarter category: Film & Video.
      • Indiegogo categories: Film, Transmedia, and Video/Web.
    • Writing super category contains:
      • Kickstarter categories: Journalism and Publishing.
      • Indiegogo category: Writing.
    • Lifestyle super category contains:
      • Kickstarter category: Crafts.
      • Indiegogo categories: Animals, Health, Religion, and Sports.
    • Society super category contains:
      • Indiegogo categories: Community, Education, Environment, and Politics.
    • Small Business is a category that only resides on Indiegogo and was not modified.
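Several of the post-processing variables above can be sketched in a few lines. The mapping below is copied from the super category list; the function names, the lower-cased platform keys, and the whitespace-based word count are assumptions for illustration, not a description of our actual scripts.

```python
from datetime import date

# (platform, platform category) -> super category, as listed in the text.
SUPER_CATEGORY = {
    ("kickstarter", "Film & Video"): "Film & Video",
    ("indiegogo", "Film"): "Film & Video",
    ("indiegogo", "Transmedia"): "Film & Video",
    ("indiegogo", "Video/Web"): "Film & Video",
    ("kickstarter", "Journalism"): "Writing",
    ("kickstarter", "Publishing"): "Writing",
    ("indiegogo", "Writing"): "Writing",
    ("kickstarter", "Crafts"): "Lifestyle",
    ("indiegogo", "Animals"): "Lifestyle",
    ("indiegogo", "Health"): "Lifestyle",
    ("indiegogo", "Religion"): "Lifestyle",
    ("indiegogo", "Sports"): "Lifestyle",
    ("indiegogo", "Community"): "Society",
    ("indiegogo", "Education"): "Society",
    ("indiegogo", "Environment"): "Society",
    ("indiegogo", "Politics"): "Society",
}


def super_category(platform: str, category: str) -> str:
    """Map a platform category to its super category.

    Shared categories (Music, Art, ...) and Small Business pass through
    unchanged; this assumes both platforms spell shared names identically.
    """
    return SUPER_CATEGORY.get((platform.lower(), category), category)


def average_backing(pledged: float, backers: int):
    """Amount pledged divided by number of backers; None when there are no backers."""
    return pledged / backers if backers else None


def campaign_length_days(start: date, close: date) -> int:
    """Length of the campaign in days, from the start and close dates."""
    return (close - start).days


def word_count(description: str) -> int:
    """Count whitespace-separated words in the campaign description text."""
    return len(description.split())
```

Returning None for the average backing of a campaign with no backers avoids a division by zero and keeps zero-dollar campaigns in the dataset without giving them a misleading average of 0.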

Data Validation

Validation is an ongoing process that occurs at many steps, from validating the data acquisition scripts and post-processing variables to validating analysis outputs. As an example, we drew a randomly generated, statistically significant sample of campaign data and performed a manual [tedious!] check (comparing our data to the actual campaign page) to ensure data validity. This step was done twice for both Kickstarter and Indiegogo. We also validated our data against any relevant data previously published by Kickstarter or Indiegogo.
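Drawing the random sample for the manual check could look something like the sketch below. The function name, the seed, and the sample size are hypothetical; the actual sample size we used is not stated here, so it is left as a parameter.

```python
import random


def draw_validation_sample(campaign_ids, sample_size, seed=0):
    """Draw a reproducible simple random sample of campaign IDs.

    sample_size and seed are illustrative parameters (assumptions); sampling
    is without replacement, so no campaign is checked twice in one pass.
    """
    ids = list(campaign_ids)
    rng = random.Random(seed)
    return rng.sample(ids, min(sample_size, len(ids)))
```

Fixing the seed makes the sample reproducible, so a second reviewer can re-check exactly the same campaigns; using a different seed gives an independent second pass.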

The final dataset used in the analysis (after cleaning and validation) contained only those campaigns that were completed (their timeline had expired) at the time of data acquisition. This resulted in a total dataset of 400,068 unique campaigns: 246,397 Indiegogo campaigns and 153,671 Kickstarter campaigns.

Measurement & Analysis

The analysis is essentially a snapshot of completed crowdfunding activity on both Kickstarter and Indiegogo as of ~ August 7th, 2014. This snapshot contained 99% of completed Kickstarter campaigns and, we believe, all of the completed Indiegogo campaigns. No data extrapolation was needed to approximate total values for the entire population set on the platforms (with the exception noted below). Listed below are some more details associated with the data that was calculated.

  • Total cumulative pledged: This is the cumulative pledged on both Kickstarter and Indiegogo together. The values for 2011 and 2013 come directly from the data. The value for 2015 is a projected value and was calculated by extrapolating historical growth data.
  • “Pledged” money: The term “pledged” money includes all contributions made to a crowdfunding campaign whether the campaign was successful or not.
  • Successful campaign: A successful campaign is defined here as one which has an amount pledged that is greater than or equal to its goal by the time the campaign ends.
  • Successful campaigns run by female campaigners (founders): The percentage was calculated using only those campaigns where a gender could be identified (the majority of the dataset). For Indiegogo, it was assumed that the first person listed in the “Team” section was the team lead and only this gender was identified.
  • Percent of VC investments going to female founders data source: Brush et al., “The Diana Project: Women Business Owners and Equity Capital: The Myths Dispelled”, Kauffman Center for Entrepreneurial Leadership, 2001.
  • City pledged per capita analysis: The number was calculated using publicly available government population sources.
  • Total web traffic: The numbers for web visits are estimations using a representative month and the data is from a secondary external source.
  • Successful campaign characteristics: This data comes from measurements across both Kickstarter and Indiegogo for successful campaigns only. Some additional info is below:
    • Most popular contribution reward level: This is the reward level range that is selected the most, where the determining factor is the count of times people select a reward range. This is different from the average pledged, which is the average money a backer gives to a campaign.
    • Average number of comments: The average is affected by outlier campaigns. To reduce this effect, the value of 17 was calculated after removing the top 1% of outlier campaigns (those with a very high number of comments). The median number of comments is 4.
    • Average number of Facebook friends: This number was calculated using only those campaigns where the campaigner (campaign founder) connected Facebook to their campaign. For Indiegogo, it was assumed that the first person listed in the “Team” section was the team lead and only this number was used, if present.
    • Average length of video: The average length of a campaign video was calculated using a statistically significant sample set as well as published data.
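Two of the definitions above, the success test and the outlier-trimmed average, can be sketched as follows. The function names are hypothetical, and the rounding rule for how many campaigns count as the "top 1%" (rounding down) is an assumption; the text does not specify it.

```python
from statistics import mean, median


def is_successful(pledged: float, goal: float) -> bool:
    """A campaign is successful if the amount pledged is at least its goal."""
    return pledged >= goal


def mean_without_top(values, top_percent: int = 1) -> float:
    """Average after dropping the highest top_percent percent of values.

    The number of values dropped is rounded down (assumption), so very small
    samples are left untouched.
    """
    ordered = sorted(values)
    n_drop = len(ordered) * top_percent // 100
    kept = ordered[: len(ordered) - n_drop]
    return mean(kept)


# Illustrative data only: 99 campaigns with 4 comments and one extreme outlier.
comment_counts = [4] * 99 + [1000]
trimmed_average = mean_without_top(comment_counts)
typical_count = median(comment_counts)
```

Reporting the median alongside the trimmed average is useful because the median is unaffected by how many outliers are removed, while the plain average can be dominated by a handful of heavily commented campaigns.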

General note: Kickstarter and Indiegogo are two similar crowdfunding platforms. Although they share many similarities, there are some subtle differences. It is noted here that Indiegogo is a more open platform, where the barrier to set up and execute a crowdfunding campaign is much lower than under the relatively stricter campaign approval process on Kickstarter. In addition, the concept of “success” can be interpreted slightly differently on Indiegogo, since the platform allows for “flexible funding” campaigns.

HiveWire is a crowdfunding solutions company. In addition to doing market research and data analysis in the crowdfunding industry, we develop custom platforms, offer corporate and campaign consulting, and hold regular crowdfunding workshops. The analysis presented here is simply a subset of HiveWire’s data and insights collection. We have ongoing analytics and insights projects for internal use and external customers. If you are interested in HiveWire services or products, please contact us.