CSE4/510: Introduction to Reinforcement Learning

Final Course Project
Applied Deep Reinforcement Learning
Multiple deadlines, please see below
Description
The goal of the Final Course Project is to explore advanced methods and/or applications in reinforcement
learning. You will be expected to prepare a proposal, milestone report, final report, and final presentation.
All projects should evaluate novel ideas that pertain to deep RL or its applications. The project must involve
reinforcement learning algorithms. You are encouraged to use your ongoing research work as a project in
this course, provided that this work relates to deep reinforcement learning. You may discuss the topic of
your final project with course staff by email, private message in Piazza, or in office hours. If you are not sure
about the topic, we encourage you to speak with us. There are a few directions, each with its own checkpoints.
Multiagent RL
Building and solving multiagent tasks (including but not limited to agent communication, transportation
problems, multi-agent cooperation, etc.); any of these might potentially lead to a research project.
Steps:
1. Building a multiagent environment from scratch (can be an extension of your work in Assignment 1).
2. Solving the environment using any tabular methods
3. Solving the environment using any deep RL methods (DQN, DDQN, AC, A2C, DDPG, TRPO, PPO,
etc.) and comparing the results
Checkpoints:
• Solving the environment using any tabular methods
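As a rough illustration of steps 1 and 2, the sketch below builds a tiny two-agent grid and trains independent tabular Q-learners on it. Everything here is an assumption made for illustration only: the class name TwoAgentGrid, the 3x3 layout, the shared reward values, the 20-step time limit, and the hyperparameters are hypothetical and are not prescribed by the course. Collisions between agents are not modelled.

```python
import random
from collections import defaultdict


class TwoAgentGrid:
    """Hypothetical 3x3 grid: each agent must walk to its own goal cell."""
    SIZE = 3
    MOVES = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # right, left, down, up as (row, col) deltas
    GOALS = [(2, 2), (0, 2)]

    def reset(self):
        self.pos = [(0, 0), (2, 0)]
        self.steps = 0
        return tuple(self.pos)  # joint state: both agents' positions

    def step(self, actions):
        for i, action in enumerate(actions):
            if self.pos[i] == self.GOALS[i]:
                continue  # an agent that reached its goal stays there
            dr, dc = self.MOVES[action]
            row = min(max(self.pos[i][0] + dr, 0), self.SIZE - 1)
            col = min(max(self.pos[i][1] + dc, 0), self.SIZE - 1)
            self.pos[i] = (row, col)
        self.steps += 1
        solved = self.pos == self.GOALS
        done = solved or self.steps >= 20  # assumed time limit
        reward = 10.0 if solved else -1.0  # shared team reward (assumed values)
        return tuple(self.pos), reward, done


def greedy(table, state):
    """Index of the highest-valued action (ties go to the lowest index)."""
    values = table[state]
    return values.index(max(values))


def train(episodes=3000, alpha=0.2, gamma=0.95, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    env = TwoAgentGrid()
    # One Q-table per agent, keyed by the joint state, one value per own action.
    q = [defaultdict(lambda: [0.0] * 4) for _ in range(2)]
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            actions = [
                rng.randrange(4) if rng.random() < epsilon else greedy(q[i], state)
                for i in range(2)
            ]
            next_state, reward, done = env.step(actions)
            for i in range(2):
                # Standard Q-learning update; no bootstrapping once the episode ends.
                best_next = 0.0 if done else max(q[i][next_state])
                td_error = reward + gamma * best_next - q[i][state][actions[i]]
                q[i][state][actions[i]] += alpha * td_error
            state = next_state
    return q


def greedy_return(q):
    """Total reward of one episode when both agents act greedily."""
    env = TwoAgentGrid()
    state, done, total = env.reset(), False, 0.0
    while not done:
        state, reward, done = env.step([greedy(q[i], state) for i in range(2)])
        total += reward
    return total


if __name__ == "__main__":
    print("greedy episode return:", greedy_return(train()))
```

The greedy episode return gives one number that can later be compared against a deep RL agent trained on the same environment, as step 3 asks.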
RC Cars
Setting up the simulator, training the cars in the simulator, applying results on the real RC cars. These
steps may require prior knowledge in robotics or autonomous vehicles.
Steps:
1. Install and explore the DeepRacer RC cars simulator
2. Check existing solutions and apply any RL methods to teach the car to drive in the simulator
3. Apply the learned policy to the real RC cars (with the ultimate goal of making a car move forward using
only the RL algorithm)
Checkpoints:
• Apply any RL methods to teach the car to drive in the simulator
Exploring Deep RL Algorithms
Explore recent advances in RL. This may include solving any of the environments below using deep RL
algorithms.
Possible environments include:
• Google Research Football Environment [blog post, github, includes participation in the tournament]
• MALMO (platform built on top of Minecraft) [github]
• Robotics by OpenAI [details, blog post]
• Atari by OpenAI [details]
Steps:
1. Set up the environment
2. Check the existing baseline methods applied to solve it
3. Apply deep RL to improve the results
Checkpoints:
• Check existing baseline methods applied to solve it
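When comparing a baseline with your own agent, it helps to score both with the same evaluation routine. The sketch below is one possible way to do that, not a required protocol. It assumes a simplified interface in which reset() returns an observation and step(action) returns (observation, reward, done); the real signature depends on the environment library and version you use. The Corridor class is a hypothetical stand-in so the example runs without any external package.

```python
import random
from statistics import mean


class Corridor:
    """Hypothetical stand-in environment: walk right from cell 0 to cell 5."""

    def reset(self):
        self.pos, self.t = 0, 0
        return self.pos

    def step(self, action):  # action: 0 = left, 1 = right
        self.pos = max(0, self.pos + (1 if action == 1 else -1))
        self.t += 1
        done = self.pos == 5 or self.t >= 50  # goal reached or time limit hit
        return self.pos, -1.0, done  # every step costs -1


def evaluate(env, policy, episodes=100):
    """Mean undiscounted episode return of `policy` on `env`."""
    returns = []
    for _ in range(episodes):
        obs, done, total = env.reset(), False, 0.0
        while not done:
            obs, reward, done = env.step(policy(obs))
            total += reward
        returns.append(total)
    return mean(returns)


rng = random.Random(0)
random_score = evaluate(Corridor(), lambda obs: rng.randrange(2))  # random policy
baseline_score = evaluate(Corridor(), lambda obs: 1)  # hand-written "always right"
print("random:", random_score, "baseline:", baseline_score)
```

Reporting the random-policy score next to the baseline and your own agent makes it easier to judge how much of the improvement comes from learning.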
You can also propose your own topic, in which case you will get an individual checkpoint.
If you get interesting results, we encourage you to share your project with the public by
participating in the CSE Demo Days or other events, so it is beneficial to choose a topic that
you are really interested in.
If you do not know what to choose, go with Exploring Deep RL Algorithms on Atari by OpenAI.
You may also come up with your own topic proposal. Please talk to Alina [[email protected]].
Registering your team
Deadline: March 10
Google Form link will be added later.
Writing the proposal
Deadline: March 13
The project proposal should be a one-page, single-spaced extended abstract motivating and outlining the
project you plan to complete. Your proposal should have the following structure:
1. Topic
2. Objective. Explain the objective of the project and why that objective is relevant and important.
3. Related Work. Briefly review the most relevant prior work, and highlight where those works fall short
of meeting the objectives described above.
4. Technical Outline. Explain your approach at a high level, making clear the novel technical contribution.
State which environment and algorithm you are planning to use.
Before you submit your proposal, it should be approved by a member of the course staff.
Submitting the checkpoint
Deadline: April 10
Each direction has its own expectations for the middle checkpoint. If you do your own project, the
checkpoint has to be confirmed during the proposal.
Submitting the Project
Deadline: May 1
Complete your project in either a Jupyter Notebook or a Python script. In your report include:
• The main motivation of your project (Why is it important/novel?)
• Preliminary materials (Discuss the algorithms, some background info you need to know)
• Implementation details
• Your results
Present your work
Presentation Days: will be added later
Present your work during the Presentation Days. Registration slots will be available about a week before the
presentation dates. The whole team should present the work. Note: your presentation should reflect the work
you have submitted. If you take part in CSE Demo Days, you will make a short presentation during that day.
Presentation details
Length: 10 mins + followup questions
Presentation Templates: UB branded ppt templates or UB CSE PowerPoint template
Suggested presentation structure:
– Project Title / Team’s Name / Course / Date [1 slide]
– Project Description [1 slide]
– Background [max 2 slides]
– Implementation [max 2 slides]
– Results (Graphs & Any Visuals) [max 4 slides]
– Key Observations / Summary [1 slide]
– Thank you Page [1 slide]
Important Information
This project can be done in a team (up to three people) or individually. The standing policy of the Department
is that all students involved in an academic integrity violation (e.g. plagiarism in any way, shape, or form)
will receive an F grade for the course.
Late Days Policy
If you are working in a team, the team may use as many late days as the teammate with the most late days
remaining. Thus, if one teammate has 3 days left and another has 5 days left, your team has 5 days that can be
used for late submission without penalty. Please note that the final submission of the project has a hard deadline.
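The team rule above amounts to taking the maximum over teammates. A minimal sketch, where the function name team_late_days is hypothetical and used only for illustration:

```python
def team_late_days(days_left_per_member):
    """Late days a team may use: the largest balance among its members."""
    if not days_left_per_member:
        raise ValueError("a team needs at least one member")
    return max(days_left_per_member)


# The example from the policy: one teammate has 3 days left, another has 5.
print(team_late_days([3, 5]))  # 5
```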
Important Dates
March 10, 11:59pm – Register your team
March 13, 11:59pm – Approve your project proposal with any of the course staff
April 10, 11:59pm – Checkpoint is due
May 3, 11:59pm – Project is Due