# START Engagement Report Generator

## Overview

This tool compiles user engagement data from multiple CSV exports of the reSTART Drupal backend into a single consolidated report showing participant activity across all modules and features of the START app.

## What This Script Does

The `generate_engagement_report.py` script aggregates data from multiple CSV files into one engagement report that includes:

- **User Identification**: Participant ID (PID) and onboarding date
- **Check-in Activity**: Total check-ins and check-ins within the past 15 days
- **App Usage**: Number of times participants opened the app
- **Module Completion Tracking**:
  - Intro modules (Welcome, Self Care, Stimulant Use & HIV)
  - Positive Events activities (Good Things, Gratitude Journaling, Meditation)
  - Mindfulness practices (Informal Mindfulness, Self Compassion, Meditation)
  - Reappraisal exercises (Reappraisal, Meditation)
  - Values activities (Intro, Values, Strengths, Goals, Meditation)
  - Kindness practices (Kindness, Meditation)
- **Video Engagement**: Total videos watched per participant
- **Reminder Settings**: Number of active reminders configured

The output is a CSV file formatted to match the standard START engagement report structure.

## Required Data Files

The script expects the following CSV files to be present in the `data` subdirectory of the project root:

| File Name | Description | Key Fields |
|-----------|-------------|------------|
| `check-in-data.csv` | Daily check-in submissions | Name, Submitted Date, Meds, Mood |
| `time-in-app.csv` | App session timestamps | User, start_time, end_time, Authored on |
| `intro-status.csv` | Intro module completions | Name, type, number_of_times_reported |
| `positive-events-status.csv` | Positive events activities | Name, type, number_of_times_reported |
| `mindfulness-status.csv` | Mindfulness practice completions | Name, type, number_of_times_reported |
| `reappraisal-status.csv` | Reappraisal exercise completions | Name, type, number_of_times_reported |
| `values-status.csv` | Values module activities | Name, type, number_of_times_reported |
| `kindness-status.csv` | Kindness practice completions | Name, type, number_of_times_reported |
| `videos-watched.csv` | Video viewing counts per user | Name, number_of_times_reported |
| `reminders.csv` | Reminder configurations | User, Daily Reminder, Weekly Reminder |
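
Before running the report, it can help to confirm that every file in the table is in place. The following is a minimal sketch of such a check, not part of `generate_engagement_report.py`; the helper name `find_missing_files` is hypothetical, and the `data` folder is assumed to sit next to where the check is run.

```python
from pathlib import Path

# File names copied from the "Required Data Files" table.
REQUIRED_FILES = [
    "check-in-data.csv",
    "time-in-app.csv",
    "intro-status.csv",
    "positive-events-status.csv",
    "mindfulness-status.csv",
    "reappraisal-status.csv",
    "values-status.csv",
    "kindness-status.csv",
    "videos-watched.csv",
    "reminders.csv",
]


def find_missing_files(data_dir):
    """Return the required file names that are absent from data_dir."""
    data_path = Path(data_dir)
    return [name for name in REQUIRED_FILES if not (data_path / name).is_file()]


if __name__ == "__main__":
    # Assumption: the check is run from the project root.
    missing = find_missing_files("data")
    if missing:
        print("Missing required files:")
        for name in missing:
            print(f"  - {name}")
    else:
        print("All required files are present.")
```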

### Optional Files (Not Used in Report Generation)

The following files may also be present from Drupal exports but are not required for the engagement report:

- `amplified-practice.csv` - Additional practice tracking
- `contact-us.csv` - User support requests
- `forgot-pin-requests.csv` - PIN reset requests
- `videos-watched-count.csv` - Video statistics by URL
- `videos-watched-count-by-user.csv` - Detailed video viewing by user and URL

## Data Preparation

### Exporting Data from reSTART Drupal Backend

1. **Export Required Data**: From the reSTART Drupal backend, export each of the required datasets listed above
2. **Save as CSV**: Ensure all exports are saved in CSV format with UTF-8 encoding
3. **Use Consistent Naming**: Name the exported files exactly as shown in the "Required Data Files" table above
4. **Place in Directory**: Copy all CSV files to the `data` subdirectory (the script expects all input files in the `data` folder)

### Data Format Requirements

- **CSV Format**: All files must be properly formatted CSV with comma delimiters
- **Headers**: The first row must contain column headers matching the Key Fields listed in the "Required Data Files" table
- **Encoding**: UTF-8 encoding is required for proper handling of special characters
- **User Identifiers**: The "Name" or "User" field should contain consistent participant usernames across all files
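
The header requirement above can be checked before a run. This is a rough sketch under stated assumptions: `missing_columns` is a hypothetical helper rather than a function in the script, and the expected column names are whatever you take from the Key Fields column for the file in question.

```python
import csv


def missing_columns(csv_path, expected_columns):
    """Return the expected column names that are not in the file's header row."""
    # utf-8-sig reads plain UTF-8 and also strips a byte-order mark if one is present.
    with open(csv_path, newline="", encoding="utf-8-sig") as handle:
        header = next(csv.reader(handle), [])
    present = {name.strip() for name in header}
    return [name for name in expected_columns if name not in present]
```

For example, `missing_columns("data/check-in-data.csv", ["Name", "Submitted Date", "Meds", "Mood"])` returns an empty list when all four headers are present.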

## How to Run the Script

### Prerequisites

- **Python 3**: The script requires Python 3.6 or later
- **Standard Library Only**: No additional Python packages need to be installed (uses only built-in modules)

### Running the Script

1. **Open Terminal**: Navigate to the project root directory

   ```bash
   cd /path/to/start-reports
   ```

2. **Execute the Script**: Run the Python script

   ```bash
   python3 generate_engagement_report.py
   ```

   The script will read input files from the `data` subdirectory and generate the output in the root directory.

3. **Locate Output**: The script will generate a timestamped CSV file in the root directory

   ```
   engagement_report_YYYYMMDD_HHMMSS.csv
   ```

   Example: `engagement_report_20260115_023836.csv`
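
A file name in this pattern can be produced with `datetime.strftime`. The sketch below illustrates the naming scheme only; `build_report_filename` is a hypothetical name, and the script's own implementation may differ.

```python
from datetime import datetime


def build_report_filename(now=None):
    """Build a name of the form engagement_report_YYYYMMDD_HHMMSS.csv."""
    now = now or datetime.now()
    return f"engagement_report_{now.strftime('%Y%m%d_%H%M%S')}.csv"


print(build_report_filename(datetime(2026, 1, 15, 2, 38, 36)))
# engagement_report_20260115_023836.csv
```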

### Expected Output

When the script runs successfully, you will see:

```
============================================================
START Engagement Report Generator
============================================================

Loading check-in data...
Loading time in app data...
Loading intro status...
Loading positive events status...
Loading mindfulness status...
Loading reappraisal status...
Loading values status...
Loading kindness status...
Loading videos watched...
Loading reminders...

Report generated successfully: /path/to/start-reports/engagement_report_20260115_023836.csv
Total users: 173

============================================================
Report generation complete!
============================================================
```

Note: Input CSV files are read from the `data/` subdirectory, while the output report is saved to the root directory.

## Output File Format

The generated engagement report follows this structure:

- **Header Row**: Column names for all metrics
- **Reference Row**: Example values showing expected data format (labeled "# FROM STUDY TEAM")
- **Data Rows**: One row per participant, sorted alphabetically by username

### Column Headers

```
PID, Onboard Date, check-in, CHECK-in Past 15 Days, #times in app,
Intro Welcome, Intro Selfcare, Intro Stim Use,
PosEvents GoodThings, PosEve Grat Journal, PosEve Meditation,
Mindful Informal Mindfulness, Mindful Self Compassion, Mindful Meditation,
Reappraisal Reappraisal, Reappraisal Meditation,
Values Intro, Values Values, Values Strengths, Values Goals, Values Meditation,
Kindness Kindness, Kindness Meditation,
Videos, Reminders
```
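
The row layout described above (header row, reference row, then data rows sorted by username) could be written with `csv.DictWriter` roughly as follows. This is a sketch, not the script's actual `generate_report()`: the function name `write_report` and its arguments are hypothetical, placing the "# FROM STUDY TEAM" label in the `PID` column is an assumption, and columns missing from a row are simply left blank here.

```python
import csv

# Column names copied from the "Column Headers" listing.
HEADERS = [
    "PID", "Onboard Date", "check-in", "CHECK-in Past 15 Days", "#times in app",
    "Intro Welcome", "Intro Selfcare", "Intro Stim Use",
    "PosEvents GoodThings", "PosEve Grat Journal", "PosEve Meditation",
    "Mindful Informal Mindfulness", "Mindful Self Compassion", "Mindful Meditation",
    "Reappraisal Reappraisal", "Reappraisal Meditation",
    "Values Intro", "Values Values", "Values Strengths", "Values Goals",
    "Values Meditation",
    "Kindness Kindness", "Kindness Meditation",
    "Videos", "Reminders",
]


def write_report(path, rows_by_user, reference_row=None):
    """Write the header, an optional reference row, then one row per user."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=HEADERS)
        writer.writeheader()
        if reference_row is not None:
            writer.writerow(reference_row)
        # Data rows are sorted alphabetically by username.
        for username in sorted(rows_by_user):
            writer.writerow({**rows_by_user[username], "PID": username})
```

Calling `write_report("report.csv", {"bob": {"Videos": 2}}, reference_row={"PID": "# FROM STUDY TEAM"})` would produce a three-line file: headers, the reference row, and one row for `bob`.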

## Troubleshooting

### Missing Data Files

If a required CSV file is missing from the `data/` directory, the script will display a warning but continue processing:

```
Warning: /path/to/data/filename.csv not found
```

The report will still be generated with zeros for metrics from missing files.
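
The warn-and-continue behavior can be pictured with a small loader like the one below. It is a sketch of the pattern only; `load_rows` is a hypothetical helper, and the script's own loading methods may be structured differently.

```python
import csv
from pathlib import Path


def load_rows(csv_path):
    """Return the file's rows as dicts, or an empty list if the file is absent."""
    path = Path(csv_path)
    if not path.is_file():
        # Warn but do not abort, so the remaining files are still processed.
        print(f"Warning: {path} not found")
        return []
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))
```

Because a missing file yields an empty list, any counts derived from it stay at their default of zero.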

### No Data for a Participant

Participants appear in the report if they have data in at least one of the input CSV files. If a participant has no data in a particular category, that metric will show as `0` in the report.

### Date Parsing Issues

The "CHECK-in Past 15 Days" calculation depends on the `Submitted Date` field in `check-in-data.csv` being in `YYYY-MM-DD` format. If dates are in a different format, that column may not calculate correctly.
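
To see why the format matters, here is one way a 15-day count could be computed with `datetime.strptime`. This is an illustrative sketch, not the script's code: `count_recent_check_ins` is a hypothetical name, the window is assumed to include both today and the date 15 days earlier, and only the first ten characters of each value are parsed so that a trailing time component is ignored.

```python
from datetime import date, datetime, timedelta


def count_recent_check_ins(submitted_dates, today=None, window_days=15):
    """Count date strings (YYYY-MM-DD) that fall within the last window_days."""
    today = today or date.today()
    cutoff = today - timedelta(days=window_days)
    recent = 0
    for raw in submitted_dates:
        try:
            submitted = datetime.strptime((raw or "").strip()[:10], "%Y-%m-%d").date()
        except ValueError:
            # Values in any other format are skipped, which undercounts check-ins.
            continue
        if cutoff <= submitted <= today:
            recent += 1
    return recent


dates = ["2026-01-15", "2025-12-31", "2025-12-30", "01/10/2026", "2026-01-10 08:30:00"]
print(count_recent_check_ins(dates, today=date(2026, 1, 15)))
# 3
```

In this example `01/10/2026` is a recent check-in but is silently dropped because it does not parse, which is the kind of undercount described above.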

### Encoding Errors

If you encounter encoding errors, ensure all CSV files are saved with UTF-8 encoding. In Excel, choose the "CSV UTF-8 (Comma delimited)" format when saving; Google Sheets downloads CSV files as UTF-8 by default.
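
To find which file is causing the error, each one can be test-decoded as UTF-8. The helper below is a minimal sketch with a hypothetical name (`is_utf8`); it is not part of the script.

```python
def is_utf8(path):
    """Return True if the whole file decodes as UTF-8, False otherwise."""
    try:
        with open(path, encoding="utf-8") as handle:
            for _ in handle:
                pass  # Reading every line forces the full file to be decoded.
    except UnicodeDecodeError:
        return False
    return True
```

Running `is_utf8` over each file in `data/` and re-exporting the ones that return `False` is usually enough to clear the error.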

## Maintenance and Updates

### Modifying the Script

The script is organized into methods for each data source:

- `load_check_in_data()` - Processes check-in submissions
- `load_time_in_app()` - Counts app sessions
- `load_intro_status()` - Aggregates intro module completions
- `load_positive_events_status()` - Tracks positive events activities
- `load_mindfulness_status()` - Compiles mindfulness practices
- `load_reappraisal_status()` - Aggregates reappraisal exercises
- `load_values_status()` - Tracks values activities
- `load_kindness_status()` - Compiles kindness practices
- `load_videos_watched()` - Counts video views
- `load_reminders()` - Counts active reminders

To modify how a particular metric is calculated, edit the corresponding method.

### Adding New Metrics

To add a new metric to the report:

1. Add the field to the `user_data` default dictionary in `__init__()`
2. Add the column name to the `headers` list in `generate_report()`
3. Create a new method to load the data (or modify an existing one)
4. Call the new method in `load_all_data()`
5. Add the field to the row dictionary in `generate_report()`
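
The five steps can be sketched on a heavily reduced, hypothetical class. Everything here is an assumption made for illustration: the class name `EngagementReportSketch`, the new metric `amplified_practice`, the `Amplified Practice` column, and the presence of a `Name` column in `amplified-practice.csv` are not confirmed by the script or the exports. Only the method and attribute names `__init__()`, `user_data`, `load_all_data()`, `generate_report()`, and `headers` come from the steps above.

```python
import csv
from collections import defaultdict
from pathlib import Path


class EngagementReportSketch:
    """Reduced stand-in for the real generator class (hypothetical)."""

    def __init__(self, data_dir="data"):
        self.data_dir = Path(data_dir)
        # Step 1: give the new field a default so users without data show 0.
        self.user_data = defaultdict(lambda: {"videos": 0, "amplified_practice": 0})

    def load_amplified_practice(self):
        # Step 3: a new loader. The "Name" column is an assumption.
        path = self.data_dir / "amplified-practice.csv"
        if not path.is_file():
            print(f"Warning: {path} not found")
            return
        with open(path, newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                name = (row.get("Name") or "").strip()
                if name:
                    self.user_data[name]["amplified_practice"] += 1

    def load_all_data(self):
        # Step 4: call the new loader alongside the existing ones.
        self.load_amplified_practice()

    def generate_report(self):
        # Step 2: add the new column name to the headers list.
        headers = ["PID", "Videos", "Amplified Practice"]
        rows = []
        for username in sorted(self.user_data):
            data = self.user_data[username]
            # Step 5: add the field to the row dictionary.
            rows.append({
                "PID": username,
                "Videos": data["videos"],
                "Amplified Practice": data["amplified_practice"],
            })
        return headers, rows
```

In the real script the `headers` list and row dictionary contain all existing columns; the sketch keeps only two so the added pieces are easy to spot.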

## Support

For questions or issues related to:

- **Script functionality**: Review the code comments in `generate_engagement_report.py`
- **Data exports**: Consult the reSTART Drupal backend documentation
- **Report interpretation**: Contact the START study team

## Version History

- **v1.1** (February 6, 2026) - Directory restructure
  - Reorganized data files into `data/` subdirectory
  - Output files remain in root directory
  - Updated documentation to reflect new structure
- **v1.0** (January 2026) - Initial release
  - Processes 10 data sources from Drupal exports
  - Generates comprehensive engagement metrics
  - Supports 173 participants
  - Automatic timestamp for output files
