What kind of data are you working with?

This website aims to guide you towards computational or digital tools that you can use for your research, even if you don't yet know exactly what you want to do. With the exception of Social media, we try to distinguish the type of material from the method used to collect it. If you work with survey or interview data that contains digitised free text, you may want to look into Text. The same applies if you are studying text in books or newspapers. If your interviews are recordings, look into Audio or Video instead. Please note that this list of tools is not intended to be exhaustive; rather, it is a curated set of recommendations to help you get started.

Pick the main type of data you have by using the buttons below, then let the decision tree guide you to the right tool. Click on the tool name to learn more about it.


Prerequisites
R
Python
Other coding languages
None

Social media

Social media data include structured and unstructured data collected (with permission) from platforms such as Facebook or Instagram. Tools in this section are used to retrieve content from these sites using APIs, perform social network analysis (SNA), and visualise user relationships through network diagrams.

  • Social media
    • Collect
          • OpenRefine
      • API
          • Mastodon API php
          • NodeXL Pro
          • PRAW
          • Python-YouTube
          • YouTube Data Tools
    • Clean or prepare data
        • dplyr and tidyr
        • OpenRefine
    • Analyse
      • Import data
          • httr2
          • Requests
      • Visualise
          • Gephi
          • GraphViz
          • igraph
          • Netlytic
          • NetworkX
      • Annotate
          • NVivo
      • Social network analysis
          • NodeXL
    • Publish
        • figshare
        • WordPress
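
As a rough illustration of what social network analysis involves, the sketch below counts how many connections each account has in a small edge list. The account names and the reply relationships are invented for the example, and the code uses only Python's standard library; tools such as NetworkX or igraph offer far richer measures and visualisations.

```python
from collections import Counter

# Hypothetical edge list: each pair is (account, account it replied to).
edges = [
    ("ana", "ben"),
    ("ana", "cho"),
    ("ben", "cho"),
    ("dev", "ana"),
]


def degree_counts(edge_list):
    """Count how many connections each account has, ignoring direction."""
    degrees = Counter()
    for source, target in edge_list:
        degrees[source] += 1
        degrees[target] += 1
    return degrees


degrees = degree_counts(edges)

# List accounts from most to least connected.
for account, degree in degrees.most_common():
    print(account, degree)
```

Accounts with many connections are often a useful starting point when deciding which parts of a network to examine more closely.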

Images

Image data includes photos captured with a camera and images scanned from physical media, such as historical photographs or advertisements from print media. Tools in this section are used to edit images, and to extract and analyse information from them.

  • Images
    • Collect
      • Web scraping
          • Beautiful Soup
          • rvest
    • Edit
        • Affinity Designer 2
        • Affinity Photo 2
        • GIMP
        • Image J
        • ImageMagick
    • Analyse
      • Import data
          • Requests
      • Image classification
          • Fastai
          • keras
          • TensorFlow
      • Image processing
          • magick
          • Pillow
    • Publish
        • figshare
        • Omeka Classic
        • WordPress

Text

Textual data include extensive collections of books as well as short social media posts expressing emotions, and everything in between. Tools in this section will help you to collect (or mine) text from various sources, enrich primary data with annotations and analyse it using multiple qualitative and quantitative methodologies.

  • Text
    • Collect
          • OpenRefine
      • Image to text
          • Tesseract
      • Web scraping
          • Beautiful Soup
          • rvest
          • Scrapy
          • Selenium
      • Survey
          • Qualtrics
          • REDCap
    • Clean or prepare data
        • dplyr and tidyr
        • OpenRefine
        • tidytext
    • Analyse
      • Import data
          • httr2
          • qualtRics
          • Requests
      • Cluster
          • Apache Open NLP
          • GloVe
          • Mallet
          • NLTK
          • quanteda
          • spaCy
          • spacyr
          • text2vec
          • tokenizers
          • Voyant Tools
      • Categorise
          • Apache Open NLP
          • Ngram Viewer
          • NLTK
          • Voyant Tools
      • Corpus analysis
          • AntConc
          • Mallet
          • NLTK
          • Voyant Tools
      • Annotate
          • CATMA
          • NVivo
      • Topic modelling
          • GenSim
          • lda
          • Mallet
          • text2vec
          • TMT Toolbox
          • topicmodels
      • Sentiment analysis
          • GenSim
          • NLTK
          • sentimentr
          • spaCy
          • spacyr
          • syuzhet
          • TMT Toolbox
          • VADER
    • Publish
        • figshare
        • Omeka Classic
        • WordPress
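
To give a sense of the simplest kind of quantitative text analysis, the sketch below counts word frequencies in a short passage. The sample sentence is made up, and the tokenisation rule (lowercase letters and apostrophes only) is an assumption chosen for brevity; libraries such as NLTK, spaCy, or quanteda handle tokenisation much more carefully.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Lowercase the text, split it into words, and count each word."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


# Made-up sample text standing in for a real corpus.
sample = "The cat sat on the mat. The mat was warm."
freqs = word_frequencies(sample)

# Show the three most frequent words with their counts.
print(freqs.most_common(3))
```

The same idea scales up to whole corpora, where frequency counts are usually the first step before corpus analysis or topic modelling.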

Audio

Sound data include recordings of music, interviews, podcasts, and audiobooks. Tools in this section allow you to import sound files into a computer, edit or analyse their acoustic properties, or transcribe their contents.

  • Audio
    • Annotate or transcribe
        • Elan
        • Otter.ai
    • Edit
        • Audacity
        • Twisted Wave
    • Analyse
      • Annotate
          • NVivo
          • Librosa
          • Praat
    • Publish
        • figshare
        • Omeka Classic
        • WordPress

Geospatial

Geospatial data consists of coordinates specifying the latitude and longitude of a location, as well as qualitative and quantitative data associated with a particular place. Tools in this section allow you to visualise coordinates and associated information on a map, or obtain coordinates for places of interest.

  • Geospatial
    • Collect
        • ArcGIS Hub
        • AURIN Data Catalogue
        • GeoPy
        • OpenRefine
        • OpenStreetMap (OSM)
        • TLCmap
    • Clean or prepare data
        • OpenRefine
    • Prepare data
        • GeoPandas
        • Map Warper
    • Analyse
        • ArcGIS
        • Palladio
        • QGIS
        • sf
    • Visualise
        • Deck.gl
        • Folium
        • Kepler GL
        • Leaflet
        • StoryMap JS
        • TLCmap
    • Publish
        • Google My Maps
        • Mapbox Studio
        • StoryMap JS

Video

Video data includes films, interview recordings, and archival footage. Tools in this section will help you to produce, edit, or annotate video data.

  • Video
    • Annotate or transcribe
        • Elan
    • Convert format
        • HandBrake
    • Publish
        • figshare
        • Omeka Classic
        • WordPress

Spreadsheet

Spreadsheet data refers to information in table format that can be opened using Excel or similar software, for example files with the extension .xlsx or .csv. Spreadsheet data is also referred to as tabular or structured data. This can include survey responses, scraped website content, historical statistics, or social media data. Spreadsheet data may consist of either numbers or text, and is organised into rows and columns. Tools in this section allow researchers to perform numerical calculations on data and produce visualisations describing its statistical properties.

  • Spreadsheet
    • Collect
          • OpenRefine
      • Web scraping
          • Beautiful Soup
          • rvest
          • Scrapy
      • Survey
          • Qualtrics
          • REDCap
    • Clean or prepare data
        • dplyr and tidyr
        • OpenRefine
        • Pandas
    • Analyse
      • Import data
          • qualtRics
          • Requests
      • Quantitative
          • Minitab
          • Scikit-Learn
          • SPSS
    • Visualise
        • ggplot2
        • Matplotlib
        • Plotly
        • Seaborn
    • Publish
        • figshare
        • WordPress
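
As a minimal sketch of a numerical calculation on tabular data, the example below reads a small .csv table and summarises one column. The column names and values are invented for illustration, and the code relies only on Python's standard library; in practice Pandas or dplyr would be the more usual choice.

```python
import csv
import io
import statistics

# Invented survey table; in practice this would be read from a .csv file.
csv_text = """respondent,age,score
r1,34,7
r2,29,9
r3,41,6
"""


def column_summary(text, column):
    """Return the count, mean, and median of one numeric column."""
    reader = csv.DictReader(io.StringIO(text))
    values = [float(row[column]) for row in reader]
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
    }


summary = column_summary(csv_text, "score")
print(summary)
```

To work with a file on disk, the text passed to `column_summary` could be obtained by opening the file and reading its contents.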

About

This site was created by the Melbourne Data Analytics Platform's (MDAP) HASS Taskforce, in collaboration with David Goodman, director of the Faculty of Arts' Digital Studio. Mar Quiroga led the project; Mitchell Harrop produced the initial prototype; Jair Garcia-Mendoza, Jean Dinco, and Alex Shermon collated the content; Nadya Ulibasa gathered user feedback and improved the usability of the site; and all members of MDAP kindly volunteered their expertise to the curation and content of the tools.

Contact us

If you are still unsure which tool to use for your research, or you would like to have a chat about your specific circumstances, please write to us at hass-taskforce@unimelb.edu.au.