ANNIE – Developer Notes (Short Version)

Frontend (Voice Interface)

Main Functions

  • Speech Recognition (webkitSpeechRecognition):
    • Starts/stops on pyramid click
    • Converts user speech to text
    • Sends recognized text to the backend via sendMessage()

```javascript
recognition = new webkitSpeechRecognition();
recognition.onresult = (event) => {
  // Take the most recent result from the SpeechRecognitionResultList
  const heardText = event.results[event.results.length - 1][0].transcript.trim();
  sendMessage(heardText);
};
```


Speech Synthesis (SpeechSynthesisUtterance)

  • Cancels any current speech before starting a new utterance
  • Picks a female-sounding voice (e.g., Google, Samantha, Jenny)


```javascript
function speakText(text) {
  speechSynthesis.cancel();                 // stop any utterance in progress
  const utterance = new SpeechSynthesisUtterance(text);
  utterance.voice = selectedVoice;          // previously chosen voice
  speechSynthesis.speak(utterance);
}
```

UI Controls:

  • Pyramid click toggles listening
  • “Stop Speaking” button cancels speech

Backend (Flask + Gemini AI)

Main Components

Gemini Model Setup (biotech-focused persona)

```python
import google.generativeai as genai

model = genai.GenerativeModel(
  model_name="gemini-1.5-flash",
  generation_config={...},
  system_instruction="Expert in biotech and Illumina systems..."
)
```

Chat Endpoint

```python
@dnabot_api.route('/chat', methods=['POST'])
def chat():
  user_input = request.json.get('user_input', '')
  # A fresh chat session is started per request, so no history is retained
  response = model.start_chat().send_message(user_input)
  return jsonify({"response": response.text})
```
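Because the endpoint calls start_chat() on every request, each turn is stateless. If multi-turn context is wanted, one session can be created once and reused. The sketch below shows the pattern with a stub standing in for genai.GenerativeModel (StubModel and StubChat are hypothetical, used only so the pattern runs without an API key):

```python
# Stub that mimics the start_chat()/send_message() shape of the real model.
class StubChat:
    def __init__(self):
        self.history = []

    def send_message(self, text):
        self.history.append(text)
        class Reply:
            pass
        reply = Reply()
        reply.text = f"reply #{len(self.history)}: {text}"
        return reply

class StubModel:
    def start_chat(self):
        return StubChat()

model = StubModel()
chat_session = model.start_chat()   # created once, reused across requests

def chat_turn(user_input):
    # Each call goes through the SAME session, so history accumulates.
    return chat_session.send_message(user_input).text

print(chat_turn("What is a flow cell?"))   # → reply #1: What is a flow cell?
print(chat_turn("And a nanowell?"))        # → reply #2: And a nanowell?
```

With the real library, `chat_session = model.start_chat()` at module scope (or per user session) gives the same effect; the trade-off is that one shared session mixes history across users, so per-user session storage would be needed in production.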

Workflow Summary

  • User speaks → browser converts to text
  • Text sent to backend /dnabot/chat
  • Gemini responds → frontend speaks it
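The round trip above can be traced with a small simulation. The browser speech APIs and the Gemini call are replaced with trivial stand-ins (speech_to_text, post_chat, and text_to_speech are hypothetical names, not part of the codebase), so only the data flow is shown:

```python
import json

def speech_to_text():
    # Browser side: webkitSpeechRecognition produces a transcript.
    return "Explain paired-end reads"

def post_chat(user_input):
    # Frontend POSTs JSON to /dnabot/chat; backend parses it and
    # would forward user_input to Gemini, returning jsonify(...).
    body = json.dumps({"user_input": user_input})
    parsed = json.loads(body)
    return {"response": f"Gemini answer about: {parsed['user_input']}"}

def text_to_speech(text):
    # Browser side: SpeechSynthesisUtterance speaks the reply.
    return f"[voice] {text}"

heard = speech_to_text()
answer = post_chat(heard)["response"]
print(text_to_speech(answer))
# → [voice] Gemini answer about: Explain paired-end reads
```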