EGN 4930 / EGN 5932C Environmental Data Science - Term Project

1. Project Assignment

1.1 Project Description

You are expected to conduct a project using Python on a water or environmental issue of your choice. You should develop a research question or problem statement that guides your analysis. You need to use Python and its various libraries to process, analyze, and visualize data to answer your research question or solve an industry- or community-oriented problem. Graduate students may work individually or in groups of up to four. Undergraduate students need to work in groups of three to five. Throughout the project, you will have the opportunity to apply and deepen your skills in data analysis and visualization, critical thinking, and independent learning, in alignment with the project objectives.

1.2 Project Objectives

Your project should demonstrate your ability to perform the following tasks.

  1. Data analysis and visualization: Conduct a project using Python, encompassing data analysis and visualization, to address a water or environmental issue of your choice
  2. Research question/problem statement: Develop a research question or problem statement that guides your analysis
  3. Data acquisition and processing: Skillfully access, manage, wrangle, and analyze complex datasets with diverse data formats and from diverse sources such as Google Earth Engine, USGS ScienceBasePy, NOAA, Data.Gov, peer-reviewed articles, among others
  4. Python libraries and tools: Apply appropriate Python libraries and tools for data manipulation, analysis, and visualization, such as Jupyter, Pandas, NumPy, Matplotlib, Xarray, Cartopy, Google Earth Engine, Geemap, FloPy, and TensorFlow
  5. Result evaluation and interpretation: Critically evaluate and interpret the results of your analysis, and draw meaningful conclusions that address your original research question or problem statement
  6. Code documentation and best practices: Deliver a clear, well-documented Jupyter notebook with Python code that follows best coding practices
  7. Data communication: Effectively communicate your findings by documenting, visualizing, sharing and presenting your data insights in a clear and concise manner utilizing tools such as Jupyter notebook, GitHub, and Binder
  8. Team collaboration: If applicable, contribute meaningfully to the project, and demonstrate effective teamwork skills including communication, collaboration, and conflict resolution
  9. Reporting: Deliver a technically proficient final report following recognized academic or professional standards (e.g., Guide to Technical Report Writing)
  10. Presentation: Deliver a clear, organized, and effective class presentation to disseminate the project’s findings to a wider audience

1.3 Schedule and Assessment

To ensure the project is feasible, meet with the instructor to discuss your proposed project and obtain approval before you submit your project abstract. Meetings typically last about 15 minutes but may be extended as needed. You can meet during office hours on Tuesday and Thursday, after class, or by appointment arranged by email.

The project will be graded according to the following schedule:

The due dates of the above deliverables will be posted on Canvas.

1.4 Support

The instructor is available to provide any needed guidance and support. You are encouraged to seek help when necessary and to actively discuss your project with the instructor.

1.5 Submission

Submission dates will be posted on Canvas. Please ensure that your work is submitted on time. Late submissions will incur a 10% deduction per day. Good luck, and we look forward to seeing your creative applications in water and environmental data science with Python.

2. Rubrics

The rubrics serve as a general guide to the evaluation criteria and expectations. They are subject to change to the benefit of the students and may be adjusted based on student feedback to ensure a more relevant and engaging learning experience.

2.1 Project Summary Rubric (10%)

| Criteria | Excellent (5.0) | Good (4.5) | Satisfactory (4.0) | Needs Improvement (3.5) | Inadequate (3.0) |
| --- | --- | --- | --- | --- | --- |
| Research Question/Problem (40%) | Concise, focused, and novel research question or industry-oriented problem clearly stated. | Clear and reasonably focused research question or problem stated. | Research question or problem stated but lacks clarity. | Vague research question or problem statement. | No discernible research question or problem statement. |
| Alignment with Project Objectives (20%) | Demonstrates a strong connection to the project objectives, showcasing an understanding of the tools and techniques to be employed. | Shows a good understanding of the project objectives. | Demonstrates a basic understanding of the project objectives. | Limited connection to project objectives. | No evident connection to project objectives. |
| Relevance and Significance (10%) | Clearly explains why the chosen research question or problem is relevant, significant, and challenging. | Explains the relevance and significance of the research question or problem. | Somewhat explains the relevance and significance. | Limited explanation of relevance and significance. | Fails to address the relevance and significance. |
| Feasibility and Scope (20%) | Clearly outlines the feasibility and scope of the project, indicating a well-considered plan for implementation. | Provides a reasonable outline of feasibility and scope. | Offers a somewhat feasible and scoped plan. | Feasibility and scope are unclear. | Feasibility and scope are not addressed. |
| Clarity and Quality (10%) | Extremely clear, concise, and well-written abstract. Minor errors. | Well-written abstract with good clarity. Few errors. | Reasonably well-written abstract but needs improved clarity or concision. Multiple errors. | Poorly written abstract lacking clarity and concision. Major errors. | Very poorly written or unfocused abstract. Major errors and omissions. |

Note: Assign numerical values (5.0, 4.5, 4.0, 3.5, 3.0) to each level for calculating the overall score.
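As a worked illustration of the note above, the sketch below combines the level values with the criterion weights of the Project Summary rubric. It assumes the overall score is the weighted sum of the level values, which the rubric implies but does not state outright. The function name `overall_score` and the sample ratings are hypothetical.

```python
# Criterion weights from the Project Summary rubric (they sum to 1.0).
WEIGHTS = {
    "Research Question/Problem": 0.40,
    "Alignment with Project Objectives": 0.20,
    "Relevance and Significance": 0.10,
    "Feasibility and Scope": 0.20,
    "Clarity and Quality": 0.10,
}

# Numerical value assigned to each rubric level.
LEVELS = {
    "Excellent": 5.0,
    "Good": 4.5,
    "Satisfactory": 4.0,
    "Needs Improvement": 3.5,
    "Inadequate": 3.0,
}


def overall_score(ratings):
    """Return the weighted score (out of 5.0) given one level per criterion.

    Assumption: the overall score is the weighted sum of the level values.
    """
    missing = set(WEIGHTS) - set(ratings)
    if missing:
        raise ValueError(f"Missing ratings for: {sorted(missing)}")
    return sum(WEIGHTS[c] * LEVELS[ratings[c]] for c in WEIGHTS)


# Hypothetical ratings, for illustration only.
example = {
    "Research Question/Problem": "Excellent",
    "Alignment with Project Objectives": "Good",
    "Relevance and Significance": "Good",
    "Feasibility and Scope": "Satisfactory",
    "Clarity and Quality": "Excellent",
}
score = overall_score(example)
print(f"{score:.2f} / 5.00 ({score / 5.0:.0%})")
```

With these sample ratings the weighted sum is 2.0 + 0.9 + 0.45 + 0.8 + 0.5 = 4.65 out of 5.0, or 93%. The same approach would apply to the other rubrics by swapping in their criteria and weights.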

2.2 Interim Report Rubric (10%)

| Criteria | Excellent (5.0) | Good (4.5) | Satisfactory (4.0) | Needs Improvement (3.5) | Inadequate (3.0) |
| --- | --- | --- | --- | --- | --- |
| Introduction and Background (40%) | Provides a comprehensive introduction, clearly defining the project’s background and context. | Offers a good introduction with relevant background information. | Presents a basic introduction with some background information. | Introduction lacks depth and context. | No clear introduction or background information. |
| Project Objectives and Scope (20%) | Clearly outlines the specific objectives and scope of the project, demonstrating a well-defined plan. | Provides a reasonable outline of project objectives and scope. | Offers a somewhat clear outline of objectives and scope. | Objectives and scope are unclear. | Objectives and scope are not addressed. |
| Methods and Tools (10%) | Clearly explains the methods and tools used for data analysis, showcasing a well-thought-out approach. | Provides a good explanation of methods and tools used. | Presents a basic overview of methods and tools. | Methods and tools are unclear or poorly explained. | Methods and tools are not addressed. |
| Challenges Faced (10%) | Clearly identifies and discusses challenges encountered during the project, along with proposed solutions. | Recognizes and discusses challenges with some proposed solutions. | Acknowledges challenges without offering adequate solutions. | Challenges are mentioned but not discussed in depth. | Fails to identify or address challenges. |
| Preliminary Findings (10%) | Presents clear and insightful preliminary findings from the data analysis conducted. | Offers reasonable preliminary findings with some insights. | Provides basic preliminary findings with limited insights. | Preliminary findings lack clarity or depth. | No clear or meaningful preliminary findings. |
| Clarity and Quality (10%) | Extremely clear, concise, and well-written report. Minor errors. | Well-written report with good clarity. Few errors. | Reasonably well-written report but needs improved clarity or concision. Multiple errors. | Poorly written report lacking clarity and concision. Major errors. | Very poorly written or unfocused report. Major errors and omissions. |

Note: Assign numerical values (5.0, 4.5, 4.0, 3.5, 3.0) to each level for calculating the overall score.

2.3 Final Report Rubric (50%)

| Criteria | Excellent (5.0) | Good (4.5) | Satisfactory (4.0) | Needs Improvement (3.5) | Inadequate (3.0) |
| --- | --- | --- | --- | --- | --- |
| Introduction (10%) | Provides a compelling introduction that clearly outlines the context, background, and significance of the project topic. | Provides a clear introduction with relevant background context and information on the project’s significance. | Basic introduction that lacks depth but provides some background context. | Vague introduction with limited context or background information provided. | No introduction or background information provided. |
| Objectives and Scope (10%) | Project objectives and scope are extremely well-defined, relevant, and achieved within the project. | Objectives and scope are well-defined, relevant to the project topic, and largely achieved. | Objectives and scope are reasonably clear but could be more focused or lack full achievement. | Objectives and scope lack clarity, relevance, or are not fully achieved. | No objectives or scope provided. |
| Methods and Data (20%) | Sophisticated methods applied; high-quality, relevant datasets obtained and preprocessed. | Appropriate methods applied; good quality datasets obtained and adequately preprocessed. | Basic methods applied; reasonable data obtained with minimal preprocessing. | Suboptimal methods or data used, with insufficient preprocessing. | Poor methods or data, no preprocessing. |
| Analysis and Visualization (20%) | Excellent data analysis and insightful visualizations that strongly support the project goals. | Good analysis and effective visualizations that align with project goals. | Satisfactory analysis and visualizations, with room for improvement. | Limited analysis or weak visualizations that provide minimal insights. | Little to no analysis or visualizations. |
| Discussion (5%) | Thorough, insightful discussion of significant findings and implications. | Good discussion of main findings and implications. | Basic discussion of results and implications. | Vague or superficial discussion of findings. | No discussion of results or implications. |
| Conclusion (5%) | Excellent summary of findings, contributions, limitations, and specific future work. | Good conclusion summarizing key points and suggesting future work. | Basic conclusion with some summary of findings and general future work. | Vague conclusion with minimal summary or future work. | No conclusion provided. |
| Data Availability (20%) | Code and data fully documented following best practices, and shared along with the report and presentation on GitHub with Binder link. | Code and data reasonably well documented and shared along with the report and presentation on GitHub. | Basic code/data sharing with room for improvement on GitHub. | Code/data sharing attempted but poorly executed and report not shared on GitHub. | No code or data sharing. |
| Overall Report and Code Quality (10%) | Extremely well-written and structured. No errors. Exceeds expectations. | Well-written and structured with minimal errors. Meets expectations. | Reasonably well-written but needs improvement in structure or clarity. | Poorly written or structured. Contains multiple errors. | Very poorly written or structured. Many errors. |

Note: Assign numerical values (5.0, 4.5, 4.0, 3.5, 3.0) to each level for calculating the overall score.

2.4 Project Presentation Rubric (20%)

| Criteria | Excellent (5.0) | Good (4.5) | Satisfactory (4.0) | Needs Improvement (3.5) | Inadequate (3.0) |
| --- | --- | --- | --- | --- | --- |
| Content and Organization (20%) | Presentation is well-organized, with clear structure and a logical flow. | Presentation is organized, with a good structure and logical flow. | Presentation has a basic organization and structure. | Organization is unclear, and the structure lacks coherence. | Presentation lacks organization, with no clear structure or logical flow. |
| Clarity of Communication (20%) | Ideas are communicated clearly and effectively. Language and terminology are appropriate for the audience. | Ideas are generally clear, but there may be some instances of ambiguity or unclear communication. Language is mostly appropriate. | Communication is basic, with occasional difficulties in clarity. Language may not be consistently suitable for the audience. | Communication is unclear, making it challenging for the audience to follow. | Communication is extremely unclear, and the audience cannot understand the content. |
| Use of Visual Aids (20%) | Visual aids (slides, charts, graphs) are highly effective, enhancing understanding and engagement. | Visual aids are generally effective, supporting the content and enhancing understanding. | Visual aids are basic, and their effectiveness varies. Some may not contribute significantly to understanding. | Visual aids are not effective or may distract from the content. Their use does not enhance understanding. | No visual aids are used, or they are entirely irrelevant to the presentation. |
| Engagement and Interaction (20%) | Presenter engages the audience effectively through enthusiasm, eye contact, and responsive interaction. | Presenter engages the audience, but there may be moments of less enthusiasm or limited interaction. | Engagement is basic, with occasional disconnection from the audience. Limited interaction is observed. | Presenter lacks enthusiasm and struggles to maintain audience engagement. Interaction is minimal. | Presenter is disengaged, and there is no interaction with the audience. |
| Time Management (20%) | Presentation adheres closely to the assigned time, allowing for effective coverage of all key points. | Presentation mostly adheres to the assigned time, with minor deviations. Key points are adequately covered. | Presentation deviates noticeably from the assigned time, affecting the coverage of key points. | Significant deviations from the assigned time impact the coverage of key points. | Presentation significantly exceeds or falls short of the assigned time, compromising the coverage of key points. |

Note: Assign numerical values (5.0, 4.5, 4.0, 3.5, 3.0) to each level for calculating the overall score.

2.5 Peer-Assessment Rubric (10%)

Check this link for peer-assessment criteria, form, and submission instructions.

3. Service learning (optional)

You have the option to count service learning hours while working on your project. To qualify, your project should address a direct or research need for a not-for-profit community partner. Community partners could benefit from projects involving data analysis, predictive modeling, GIS-based spatial analysis, or workflow automation using Python libraries such as Pandas, Google Earth Engine, Geemap, and Scikit-learn. For example, one student conducted a project on water quality analysis with the Sanibel-Captiva Conservation Foundation, while another automated workflows for Naples Botanical Garden.

You can search for a community partner that aligns with your interests in the FGCU Community Partners Database. Potential partners include the Florida Department of Environmental Protection, U.S. Army Corps of Engineers, FGCU, or similar organizations. Alternatively, you can contribute to an FGCU research project such as:

You may work with the instructor on these projects, or collaborate with other faculty on any project that meets the project objectives. This is an opportunity to apply course concepts to real-world community challenges, while fulfilling service learning requirements.

Action Steps

To count service learning hours, follow these steps:

4. FAQ

The instructor will regularly update this section throughout the project to answer common questions and clarify any uncertainties that students may have. Make sure to check this section before submitting your project deliverables.

1. Are there any project themes or topics that we are restricted to?

While there is no strict limitation on project themes, your chosen topic should fall under the broader umbrella of water and environmental data science. This includes, but is not limited to, areas such as environmental science, ecological studies, hydrology, environmental engineering, climate change impacts, environmental economics, urban planning, demographic changes, public health concerns, social media discourse, and social sciences research, all with a focus on water, climate, or environmental issues. Your project should demonstrate the application of Python for data analysis, visualization, and problem-solving in these interconnected fields.

2. In case I do not have a project idea, how can I find one?

The key is to choose a project that aligns with your interests and allows you to demonstrate your skills in Python and water and environmental data analysis. To get project ideas you can:

3. For the project summary, what do you mean by “plan for implementation”?

I recommend focusing on a question or hypothesis that does not require extensive data collection or complex modeling, which can be time-consuming. Given the number of hours that you will spend on this project per week, you need to have a realistic plan to complete your project on time. Once you decide on your research/management/business question or hypothesis, you can outline a plan for implementation. Here is an example.

Remember to allocate some time for unexpected challenges and iterative improvement.

4. Will I lose points for incomplete documentation of code?

Yes. Incomplete documentation of the code will result in a deduction of marks, particularly if markdown cells are not used for annotations. In this course, you are learning not only good coding skills but also good coding practices (GCP). Markdown cell annotations improve readability for other users and for your own future reference. Narrative descriptions alongside the code make its functionality and purpose easier to understand and make later modifications easier.

For example, please review your lessons and homework solutions, where I use markdown cells to provide narrative context. This turns the notebook from code alone into a story. Has this not enhanced your learning experience?

This is another example, a real-life report with accompanying annotated code. You do not need to replicate this level of detail for your project.

For further guidance, refer to ‘GCP 2 - Describing your code’ in the ‘Good Coding Practice (GCP)’ module on Canvas. Additionally, ensure adherence to the final report rubric.
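As a minimal sketch of what documented code can look like inside a code cell (the surrounding narrative would go in markdown cells), the function below pairs a docstring with short comments that explain intent. The function name, the sample readings, and the docstring layout are illustrative assumptions, not a required course template.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def monthly_mean(records):
    """Average observations by calendar month.

    Parameters
    ----------
    records : list of (str, float)
        Pairs of ISO date string ("YYYY-MM-DD") and measured value.

    Returns
    -------
    dict
        Maps "YYYY-MM" to the mean of that month's values.
    """
    by_month = defaultdict(list)
    for iso_date, value in records:
        # Parse the date so malformed strings fail early with a clear error.
        day = date.fromisoformat(iso_date)
        by_month[f"{day.year}-{day.month:02d}"].append(value)
    # Reduce each month's list of values to a single mean, in date order.
    return {month: mean(values) for month, values in sorted(by_month.items())}


# Hypothetical salinity readings (PSU), for illustration only.
sample = [("2024-01-05", 30.0), ("2024-01-20", 32.0), ("2024-02-10", 28.0)]
result = monthly_mean(sample)
print(result)
```

The docstring states what goes in and what comes out, while the inline comments explain why each step is there. A markdown cell above such a code cell would typically say what question the computation answers.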

5. Where should I upload the code and do I need to use GitHub?

Using GitHub or a similar platform is required for a higher grade. Refer to the Data Availability criterion in the final report rubric for specific requirements. This aligns with data communication and best coding practices outlined in your project objectives. Adherence to these practices contributes to a better grade according to the final report rubric.

To upload your code and data to GitHub, follow these steps:

  1. Create a GitHub account.
  2. Create a new repository.
  3. Cite your code repository in your report.
  4. Upload your data and code to your GitHub repository.
  5. Upload your presentation and report to your GitHub repository.

Additionally, with Binder, you can open your notebooks in an executable environment, making your code promptly reproducible by anyone without the need to install Python or Jupyter.
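As a rough sketch of how a Binder launch link is put together, the helper below builds a mybinder.org URL for a public GitHub repository. The helper name `binder_url` and the account, repository, and notebook names are hypothetical placeholders. The sketch assumes the public mybinder.org service and its `labpath` query parameter for opening a file in JupyterLab.

```python
from urllib.parse import quote


def binder_url(user, repo, ref="main", notebook=None):
    """Build a mybinder.org launch link for a public GitHub repository."""
    url = f"https://mybinder.org/v2/gh/{user}/{repo}/{ref}"
    if notebook:
        # `labpath` opens the given file in JupyterLab once the session starts;
        # the path is percent-encoded so that "/" becomes "%2F".
        url += "?labpath=" + quote(notebook, safe="")
    return url


# Hypothetical account, repository, and notebook names.
link = binder_url("your-username", "your-project", notebook="notebooks/analysis.ipynb")
print(link)
```

For the launched environment to run your notebook, Binder generally needs a dependency file such as `requirements.txt` or `environment.yml` in the repository, so it is worth testing the link yourself before citing it in your report.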

If all your data and code are on GitHub and cited in your report, you only need to submit your report to Canvas. If you did not share your code on GitHub, you must submit both the report and code on Canvas.

If you prefer not to share your data on GitHub, you can still share your code or part of it to demonstrate your use of GitHub. In this case, upload your data and code to Canvas.

Whether you upload your code to Canvas or GitHub, ensure that it is fully functional for grading purposes.

5. Student projects

Spring 2025

  1. Predictive modeling of red tide events using machine learning: Link
  2. Machine learning of algal blooms from Lake Okeechobee discharge: Link
  3. Coral Reef Bleaching in the State of Florida during COVID-19: Link
  4. Evaluating downscaled precipitation data in Collier County: Link
  5. Contaminant transport in complex groundwater regime: Link

Spring 2024

  1. Plasma proteomics of loggerhead sea turtles (Caretta caretta) to establish diagnostic biomarkers for brevetoxicosis: Link
  2. Seasonal and spatial variations of nutrient concentrations in the Sanibel Slough: Implications for Watershed Management: Link
  3. Identification and analysis of upwelling events in the Gulf of Mexico and their connection to K. brevis blooms on the West Florida shelf using remote sensing data: Link
  4. Dynamic Interplay: Hurricanes and surface water salinity levels: Link
  5. Investigating the groundwater drawdown using FloPy package for MODFLOW groundwater modeling: Link
  6. Aligning plant biodiversity data of the Naples Botanical Garden with Global Biodiversity Information Facility using pandas and ArcGIS Online Notebooks: Link
  7. Exploring machine learning models for red tide prediction in the Gulf of Mexico