Planning Process

  • Originally, the career quiz was meant to be a Study Resource Table that would appear after a user plays a trivia game. To make it a distinct feature, I changed it into its own interactive quiz that helps users determine their career path based on their love for biotechnology
    • There is a set of 5 questions that ask about the user’s interest in biotechnology
    • The score they get will be matched to the scores of other people from a Kaggle dataset to determine their career path
    • The user will be able to compare their results to those in the dataset and get an idea of what they could achieve in the future based on past data
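To make the planning step concrete, here is a minimal sketch of how five quiz answers could be turned into a single score that is comparable to the dataset. The scoring scheme is an assumption for illustration, not the quiz's actual logic: it supposes each answer is a 1 to 5 rating and that the dataset's biology scores are on a 0 to 100 scale. The name quiz_score is hypothetical.

```python
def quiz_score(answers, max_per_question=5):
    """Scale a list of 1..max_per_question ratings to a 0-100 score.

    Assumption: every question is answered with a rating, and the
    dataset's biology_score column uses a 0-100 scale.
    """
    if not answers:
        raise ValueError("answers must not be empty")
    for a in answers:
        if not 1 <= a <= max_per_question:
            raise ValueError(f"rating out of range: {a}")

    lowest = len(answers)                      # every answer is 1
    highest = len(answers) * max_per_question  # every answer is the maximum
    return 100 * (sum(answers) - lowest) / (highest - lowest)


# Five answers, one per quiz question
print(quiz_score([4, 5, 3, 5, 4]))
```

Scaling to 0 to 100 keeps the quiz result on the same range as the score it is compared against, so the nearest-score match is meaningful.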

Backend Code

  • This is what my backend code looks like:
from flask import Blueprint, request, jsonify
import pandas as pd
import os  # Keep this to work with file paths

# Create Blueprint for handling resource requests
resource_api = Blueprint('resource_api', __name__, url_prefix="/api")

# Modified: Load local CSV instead of Kaggle
def load_dataset():
    """Loads the student-scores dataset from local CSV file."""
    try:
        csv_file = os.path.join(os.path.dirname(__file__), "student-scores.csv")
        df = pd.read_csv(csv_file)
        return df
    except Exception as e:
        print(f"Error loading dataset: {e}")
        return None
    
@resource_api.route('/get_careers', methods=['GET'])
def get_careers_by_biology_score():
    df = load_dataset()
    if df is None:
        return jsonify({"error": "Dataset not found"}), 500

    if "biology_score" not in df.columns or "career_aspiration" not in df.columns:
        return jsonify({"error": "Required columns missing in dataset"}), 500

    # Copy the filtered rows so adding score_diff below does not write to a view
    df = df[df["biology_score"].notnull()].copy()
    
    target_score = request.args.get("biology_score", type=float)

    if target_score is not None:
        df["score_diff"] = (df["biology_score"] - target_score).abs()
        min_diff = df["score_diff"].min()
        closest_matches = df[df["score_diff"] == min_diff]
    else:
        # If no score, just return first 5 careers
        closest_matches = df.head(5)

    return jsonify({
        "careers": closest_matches[["career_aspiration", "biology_score"]].to_dict(orient="records")
    })
  • This code defines an API endpoint, /api/get_careers, that loads data from a CSV file I got from Kaggle, which includes columns like biology_score, career_aspiration, and more. Before moving forward, it checks that the file loaded and that both of those columns exist, and returns a 500 error otherwise.

  • When a user provides a biology_score through the URL (like in a frontend quiz or Postman request), the API computes the absolute difference between each row's biology score and the user's score, then returns every row that ties for the smallest difference, along with its career.

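  • If no biology_score is given, instead of returning an error, the API simply defaults to showing the first five rows in the dataset. The same default applies when the value can't be parsed as a number, because request.args.get with type=float returns None in that case. This ensures that the API stays functional even without valid input and doesn’t break anything on the frontend.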
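The matching rule described above can be sketched without Flask or pandas. This is an illustrative re-implementation in plain Python, not the endpoint itself: closest_careers is a hypothetical helper, and the sample rows are made up rather than taken from the Kaggle file.

```python
def closest_careers(rows, target_score=None, default_count=5):
    """Return the rows whose biology_score is nearest to target_score.

    Rows with a missing score are ignored. With no target score,
    the first default_count rows are returned instead.
    """
    scored = [r for r in rows if r.get("biology_score") is not None]
    if target_score is None:
        return scored[:default_count]
    if not scored:
        return []

    # Smallest absolute difference between any row and the target
    min_diff = min(abs(r["biology_score"] - target_score) for r in scored)
    # Keep every row that ties for that smallest difference
    return [r for r in scored
            if abs(r["biology_score"] - target_score) == min_diff]


# Made-up sample data for illustration only
sample_rows = [
    {"career_aspiration": "Doctor", "biology_score": 90},
    {"career_aspiration": "Scientist", "biology_score": 86},
    {"career_aspiration": "Teacher", "biology_score": 70},
    {"career_aspiration": "Unknown", "biology_score": None},
]

for row in closest_careers(sample_rows, target_score=88):
    print(row["career_aspiration"], row["biology_score"])
```

With a target of 88, both the 90 and the 86 rows are 2 points away, so both come back. This is why the endpoint can return more than one career for a single score.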

When I run the endpoint in Postman, the response shows each biology score and its corresponding career, which is later used to make the closest possible match to the user’s score.
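For reference, the request that Postman sends can also be built in Python. This is a small sketch using only the standard library. The base URL http://localhost:5000 is an assumption about where the Flask app is running, and careers_url is a hypothetical helper name.

```python
from urllib.parse import urlencode


def careers_url(base_url, biology_score=None):
    """Build the GET URL for the /api/get_careers endpoint.

    If biology_score is omitted, no query string is added and the
    API falls back to its default response.
    """
    url = base_url.rstrip("/") + "/api/get_careers"
    if biology_score is not None:
        url += "?" + urlencode({"biology_score": biology_score})
    return url


# Assumed local development address; adjust to wherever the app runs
print(careers_url("http://localhost:5000", 85))
```

The resulting URL can be pasted into Postman or a browser, or passed to any HTTP client, to reproduce the same response.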

Future Improvements

  • Add more questions to the quiz
  • Fix the header at the top of the page
  • Add a button that resets the quiz
  • Improve the style of the entire page