Week 6

From the 5 Rules of Data Normalization poster:

  • Unnormalized data
    • redundant and repetitive data (jagged length rows)
  • 1st Normal Form (1NF) – eliminate repeating groups
  • 2NF – eliminate redundant data
  • 3NF and BCNF (Boyce-Codd NF, sometimes called 3.5NF) – eliminate columns not dependent on key
    • 3NF on the poster is really BCNF
    • oath: Each attribute in a table must be a fact about the key, the whole key, and nothing but the key, so help me Codd.
  • Higher normal forms
    • 4NF – isolate independent multiple relationships
    • 5NF – isolate semantically related multiple relationships
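
To make the first three rules concrete, here is a minimal sketch in Python. The student/advisor/course data and all table names are made up for illustration (they are not from the poster), and plain lists and dicts stand in for database tables.

```python
# Hypothetical unnormalized data: one jagged row per student, ending in a
# repeating group of courses (rows have different lengths).
unnormalized = [
    ("s1", "Ada", "Prof. Hopper", "SP 104", "CS101", "CS240"),
    ("s2", "Alan", "Prof. Hopper", "SP 104", "CS101"),
    ("s3", "Grace", "Prof. Codd", "SP 210", "CS240", "CS377", "CS101"),
]

# 1NF: eliminate repeating groups. One row per (student_id, course),
# so every row has the same shape and the key is (student_id, course).
first_nf = [
    (sid, name, advisor, office, course)
    for sid, name, advisor, office, *courses in unnormalized
    for course in courses
]

# 2NF: eliminate redundant data. name, advisor, and office depend only on
# student_id (part of the key), so they move to their own table.
students_2nf = {
    sid: (name, advisor, office)
    for sid, name, advisor, office, _course in first_nf
}
enrollments = sorted({(sid, course) for sid, *_, course in first_nf})

# 3NF: eliminate columns not dependent on the key. office is a fact about
# the advisor, not about the student, so it gets its own table too.
students = {sid: (name, advisor) for sid, (name, advisor, _) in students_2nf.items()}
advisors = {advisor: office for _, advisor, office in students_2nf.values()}

print(len(first_nf), "rows in 1NF")
print(students)
print(advisors)
print(enrollments)
```

After the last step each advisor's office is stored once, so changing it means updating a single entry instead of every row for every advisee.
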

Project ideas we brainstormed:

  • Recommendations DB for Books, TV shows, movies…
  • Student-run Rock museum - WARTHIN (local - Vassar)
  • Business accelerator / Venture Capital (source: non-profit.org)
  • Olympic-related (countries, events, ticket sales) (Kaggle)
  • Music Library - collections, composer, key, genre
  • Alumni - maybe CS-specific? (like Vassar Net)
  • Last.fm (Spotify, YouTube) - common tracks among users, most common for an individual user
  • Pokémon DB
  • Animals used in Scientific Research, what we do to them, etc. (local to Vassar)
  • Start-up DB - what kinds, trends for success/failure
  • Rate My Professor - integrate with AskBanner
  • Deece DB (menus, by day, when are certain foods served?)
  • Boardgame DB (search for board games based on attributes)
  • Data Analysis - from Government, pre/post pandemic, etc.
  • NFL DB - statistics, analysis, consistency among players
    • variation: at the team level, predictions
  • what database project would you like to work on?
    • it doesn't need to be one of the ones we brainstormed; that was just to get us thinking about the possibilities
  • who would you like to partner with? (ideally groups of two)
  • do you need help finding a partner?