
Tutorials

Seven runnable end-to-end examples, ordered from simplest to most involved. Each pairs a TweetDataset method with the kind of input it expects and shows the actual output you'll see.

| # | Tutorial | Method exercised | Notes |
| --- | --- | --- | --- |
| 01 | Quickstart: top hashtags | hashtag_histogram_alt_python | Minimum viable usage. Start here. |
| 02 | Mention histogram | mention_histogram_alt_python | Same shape as #01 but for @user tokens. |
| 03 | Bilingual n-grams | ngram_histogram_alt_python with lan='spanish' / lan='english' | Stopword handling per language. |
| 04 | Spanish sentiment range | sentiment_range_spanish_alt_python | First run is slow: loads a TensorFlow model. |
| 05 | Hashtag co-occurrence network | hashtag_weighted_coonet | Returns an igraph.Graph. |
| 06 | Mention co-occurrence network | mention_weighted_coonet | Same as #05, on mentions. |
| 07 | R-bridge: top hashtags | hashtag_histogram_r | The R-bridge path. Worker container only. |
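For orientation before opening the tutorials, the sketch below shows in plain Python what a hashtag histogram (#01) and a weighted hashtag co-occurrence count (#05) compute conceptually. It is not Whistlerlib code: the function names, the regular expression, and the lowercasing step are illustrative assumptions, and the real methods run on a cluster and return their own types (an igraph.Graph for #05).

```python
import re
from collections import Counter
from itertools import combinations

# Simplified hashtag pattern (an assumption; real tokenizers handle more cases).
HASHTAG_RE = re.compile(r"#(\w+)")


def hashtag_histogram(tweets):
    """Count how often each hashtag appears across all tweets (hypothetical helper)."""
    counts = Counter()
    for text in tweets:
        counts.update(tag.lower() for tag in HASHTAG_RE.findall(text))
    return counts


def hashtag_cooccurrence(tweets):
    """Count how often each unordered pair of hashtags shares a tweet (hypothetical helper)."""
    edges = Counter()
    for text in tweets:
        # Deduplicate and sort so each pair has one canonical (a, b) key.
        tags = sorted({tag.lower() for tag in HASHTAG_RE.findall(text)})
        edges.update(combinations(tags, 2))
    return edges


tweets = [
    "Loving #Python and #dask today",
    "#python makes #NLP easy",
    "No tags here",
]

print(hashtag_histogram(tweets).most_common(1))  # [('python', 2)]
print(hashtag_cooccurrence(tweets)[("dask", "python")])  # 1
```

The pair counts play the role of edge weights in a co-occurrence network: each hashtag is a node, and the weight is the number of tweets in which both tags appear.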

The pages here are auto-generated from each examples/<slug>/README.md by scripts/sync_tutorials.py. To change a tutorial, edit its source README rather than the generated page; CI fails if the two drift apart.
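A drift check of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not the contents of scripts/sync_tutorials.py: the function name find_drift, the docs/<slug>.md output layout, and the identity render step are all hypothetical.

```python
import tempfile
from pathlib import Path


def find_drift(examples_dir, docs_dir, render=lambda text: text):
    """Return slugs whose generated page differs from its rendered README.

    `render` stands in for whatever transformation the real script applies;
    the identity default is an assumption made for this sketch.
    """
    stale = []
    for readme in sorted(Path(examples_dir).glob("*/README.md")):
        slug = readme.parent.name
        page = Path(docs_dir) / f"{slug}.md"  # hypothetical output layout
        expected = render(readme.read_text(encoding="utf-8"))
        if not page.exists() or page.read_text(encoding="utf-8") != expected:
            stale.append(slug)
    return stale


# Demo on a throwaway directory tree: one page in sync, one edited by hand.
with tempfile.TemporaryDirectory() as tmp:
    examples, docs = Path(tmp, "examples"), Path(tmp, "docs")
    (examples / "01-quickstart").mkdir(parents=True)
    (examples / "02-mentions").mkdir(parents=True)
    docs.mkdir()
    (examples / "01-quickstart" / "README.md").write_text("# Quickstart\n", encoding="utf-8")
    (examples / "02-mentions" / "README.md").write_text("# Mentions\n", encoding="utf-8")
    (docs / "01-quickstart.md").write_text("# Quickstart\n", encoding="utf-8")
    (docs / "02-mentions.md").write_text("# Mentions (edited by hand)\n", encoding="utf-8")
    stale_slugs = find_drift(examples, docs)

print(stale_slugs)  # ['02-mentions']
```

A CI job built on this idea would exit non-zero whenever the returned list is non-empty, which is why hand edits to a generated page do not survive.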

How to read them

Each tutorial has the same three sections:

  1. What you'll see: the actual stdout the example prints, so you can match it against your run.
  2. How it works: the call chain from TweetDataset method down through base_algs.
  3. Run it: exact shell commands to bring up the cluster, run the example, and tear it down.

Prerequisites

  • Whistlerlib installed on the client (see Install from PyPI).
  • A running Whistlerlib cluster on localhost:8786 (see Install with Docker).
  • For tutorial 07: the cluster must be running the albertogarob/whistlerlib image (R is installed only in that image).
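Before running a tutorial, it can help to confirm that something is listening on localhost:8786. The check below is a minimal sketch using only the Python standard library; scheduler_reachable is a hypothetical helper, not part of Whistlerlib, and a successful TCP connection shows only that the port is open, not that the cluster is healthy.

```python
import socket


def scheduler_reachable(host="localhost", port=8786, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False


if scheduler_reachable():
    print("Cluster port is open; the tutorials should be able to connect.")
else:
    print("Nothing is listening on localhost:8786; start the cluster first.")
```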