Docs(Scheduler Architecture): update documentation of overview of scheduler #238

Merged
merged 1 commit into from
Jul 6, 2024
Binary file added docs/2024/scheduler/asset/c_arch.jpg
Binary file added docs/2024/scheduler/asset/golang_arch.png
142 changes: 111 additions & 31 deletions docs/2024/scheduler/index.md
@@ -1,31 +1,111 @@
---
sidebar_position: 1
title: Introduction
slug: /2024/scheduler/
---
<!--
SPDX-License-Identifier: CC-BY-SA-4.0

SPDX-FileCopyrightText: 2024 Aditya Singh <email.here>
-->

## Author

[Aaditya Singh](https://github.com/aadsingh)

## Contact info

- [Email](mailto:email.here)
- [LinkedIn](https://linkedin.com/in/my-user)

## Project title

Scheduler overhaul

## What's the project about?

Insert Text Here

## What should be done?

What are the plans for the project?
---
sidebar_position: 1
title: Introduction
slug: /2024/scheduler/
---
<!--
SPDX-License-Identifier: CC-BY-SA-4.0

SPDX-FileCopyrightText: 2024 Aaditya Singh <email.here>
-->

## Author

[Aaditya Singh](https://github.com/Aaditya-Singh78)

## Contact info

- [Email](mailto:singh.aaditya889@gmail.com)
- [LinkedIn](https://linkedin.com/in/aadi-singh)
- [Twitter](https://twitter.com/__Aadityasingh)

## Project title

Scheduler overhaul

## What's the project about?

This project aims to enhance the job scheduling capabilities of [FOSSology](https://github.com/fossology/fossology) by transitioning from a C-based implementation to a Go-based system. The overhaul focuses on leveraging Go's modern language features to improve concurrency, performance, and maintainability. This transition addresses the scalability and system *throughput* challenges in the current scheduler.


### Architecture Overview
![C-architecture](./asset/c_arch.jpg)

**The current architecture** uses a multi-threaded approach to manage job scheduling and execution. It is structured around several key *components*:

1. **Main Thread**: Acts as the scheduler's control unit, managing worker threads and overseeing system operations like resource allocation and health monitoring.

2. **Job Execution Queue**: Holds and manages incoming job requests, facilitating efficient job processing and priority control.

3. **Worker Threads**: Execute jobs from the queue under the main thread's management, optimizing resource use and performance.

4. **Scheduler Logic**: Determines the execution order of jobs based on priority and resource availability, ensuring systematic and efficient processing.

5. **Database Interaction**: Handles storage of job logs and results, supporting tracking, auditing, and data persistence.

6. **Error Handling Mechanism**: Manages job execution errors to ensure stability and prevent system-wide impacts from failures.

7. **Resource Allocation**: Distributes resources across jobs and threads to avoid contention and ensure efficient execution.
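To make the queue and scheduler-logic components more concrete, here is a minimal sketch of priority-based job ordering. It is written in Go (the project's target language) rather than the current C, and it is not taken from the FOSSology code: the `Job` type, the `drain` helper, and the "higher priority runs first, ties broken by ID" rule are all assumptions made for illustration.

```go
package main

import (
	"container/heap"
	"fmt"
)

// Job is a hypothetical job record; the real scheduler tracks more state.
type Job struct {
	ID       int
	Priority int // assumed convention: a higher value runs first
}

// jobQueue implements heap.Interface so the highest-priority job is popped first.
type jobQueue []Job

func (q jobQueue) Len() int { return len(q) }

func (q jobQueue) Less(i, j int) bool {
	if q[i].Priority != q[j].Priority {
		return q[i].Priority > q[j].Priority
	}
	return q[i].ID < q[j].ID // tie-break so the order is deterministic
}

func (q jobQueue) Swap(i, j int) { q[i], q[j] = q[j], q[i] }

func (q *jobQueue) Push(x any) { *q = append(*q, x.(Job)) }

func (q *jobQueue) Pop() any {
	old := *q
	n := len(old)
	j := old[n-1]
	*q = old[:n-1]
	return j
}

// drain returns the job IDs in the order they would be handed to workers.
func drain(jobs []Job) []int {
	q := &jobQueue{}
	for _, j := range jobs {
		heap.Push(q, j)
	}
	order := make([]int, 0, len(jobs))
	for q.Len() > 0 {
		order = append(order, heap.Pop(q).(Job).ID)
	}
	return order
}

func main() {
	jobs := []Job{{ID: 1, Priority: 1}, {ID: 2, Priority: 5}, {ID: 3, Priority: 5}, {ID: 4, Priority: 3}}
	fmt.Println(drain(jobs)) // prints [2 3 4 1]
}
```

A real scheduler would also weigh resource availability before dispatching, which this sketch leaves out.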

**Key Challenges**:

1. *Concurrency and Synchronization*: Ensuring that multiple worker threads operate without interfering with each other requires meticulous management of resources and synchronization.

2. *Efficiency and Throughput*: The system must optimize the processing of jobs to minimize wait times and maximize the throughput of the scheduler.

3. *Scalability*: As the number of jobs increases, the system must scale appropriately to handle the increased load without degradation in performance.

4. *Flexibility*: The system must adapt to varied job types and changing operational conditions while maintaining performance and reliability.

## What should be done?

What are the plans for the project?

1. **Refactor Existing Code**: Transition the existing C codebase to Go, restructuring components to fit Go idioms.

> **Why Go?**

- *Concurrency and Performance*: Go's native goroutine and channel-based concurrency model is highly efficient for processes that require concurrent execution, which is critical for job scheduling.

- *Memory Safety*: Automatic memory management and garbage collection in Go reduce the risk of memory-related errors, a common challenge in C due to its manual memory handling.

- *Simplicity and Productivity*: Go's clean and concise syntax, along with its powerful standard library, enables rapid development and easier maintenance compared to the verbose and complex C code.

- *Robust Tooling*: The Go toolchain provides out-of-the-box support for testing, formatting, and documentation, enhancing development workflow and product quality.

- *Cross-Platform Compatibility*: Go simplifies the build process with its strong support for cross-platform compilation, making it easier to manage and deploy on various systems without code changes.

2. **Optimize Concurrency Handling**: Implement a robust concurrency model using goroutines and channels to handle multiple jobs efficiently.

> **How will it be achieved?**

- The *new scheduler architecture* will utilise:

![architecture](./asset/golang_arch.png)

- **Goroutines for Task Management**: Efficiently handling multiple jobs in parallel to optimize resource usage.

- **Channels for Communication**: Using channels to manage job queues and worker communication, ensuring thread-safe operations.

- **Modular Design**: Structuring the scheduler with clear separation of concerns, allowing for easier updates and maintenance.
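The goroutine-and-channel design described above can be sketched as a small worker pool. This is only an illustration under stated assumptions, not the planned implementation: the `Job` and `Result` types, the `runPool` function, and the agent names used as sample data are hypothetical, and the "work" is a placeholder.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Job is a hypothetical unit of work; the real scheduler's job type will differ.
type Job struct {
	ID   int
	Name string
}

// Result records which worker handled which job.
type Result struct {
	JobID  int
	Worker int
}

// runPool starts `workers` goroutines (workers must be >= 1) that read jobs
// from a channel, and returns one Result per job, sorted by job ID.
func runPool(workers int, jobs []Job) []Result {
	jobCh := make(chan Job)
	resCh := make(chan Result, len(jobs)) // buffered so workers never block on send

	var wg sync.WaitGroup
	for w := 1; w <= workers; w++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			for j := range jobCh {
				// Placeholder for real job execution.
				resCh <- Result{JobID: j.ID, Worker: id}
			}
		}(w)
	}

	for _, j := range jobs {
		jobCh <- j
	}
	close(jobCh) // tells workers that no more jobs are coming
	wg.Wait()
	close(resCh)

	results := make([]Result, 0, len(jobs))
	for r := range resCh {
		results = append(results, r)
	}
	sort.Slice(results, func(a, b int) bool { return results[a].JobID < results[b].JobID })
	return results
}

func main() {
	jobs := []Job{{ID: 1, Name: "nomos"}, {ID: 2, Name: "copyright"}, {ID: 3, Name: "monk"}}
	for _, r := range runPool(2, jobs) {
		fmt.Printf("job %d handled by worker %d\n", r.JobID, r.Worker)
	}
}
```

Which worker picks up which job varies between runs; only the set of completed jobs is deterministic. Closing the job channel is what lets each worker's `range` loop end cleanly.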

- To ensure consistency and maintainability of the codebase, the following *coding standards* will be applied:

- *Format and Style*: Use `gofmt` for formatting and `golint` for linting the code.

- *Error Handling*: Follow Go's idiomatic way of error handling. Always check for errors where they can occur and handle them gracefully.

- *Commenting and Documentation*: Write clear comments for all public functions and methods, using Godoc conventions. Document all packages and provide examples where necessary.

- *Concurrency Practices*: Use goroutines and channels appropriately. Avoid common pitfalls like race conditions by using synchronization primitives from the `sync` package when needed.

- *Testing*: Write comprehensive unit tests for all components using Go's built-in `testing` package. Aim for a high level of test coverage to ensure reliability and facilitate future changes.
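As one possible reading of the concurrency guideline above, the sketch below guards shared state with a `sync.Mutex` so that many goroutines can update it safely. The `StatusTable` type and its methods are hypothetical names introduced for this example and are not part of FOSSology.

```go
package main

import (
	"fmt"
	"sync"
)

// StatusTable is a hypothetical registry of job states shared by workers.
type StatusTable struct {
	mu     sync.Mutex
	states map[int]string
}

// NewStatusTable returns an empty, ready-to-use table.
func NewStatusTable() *StatusTable {
	return &StatusTable{states: make(map[int]string)}
}

// Set records the state of a job; the mutex makes concurrent calls safe.
func (t *StatusTable) Set(id int, state string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.states[id] = state
}

// Count returns how many jobs are currently in the given state.
func (t *StatusTable) Count(state string) int {
	t.mu.Lock()
	defer t.mu.Unlock()
	n := 0
	for _, s := range t.states {
		if s == state {
			n++
		}
	}
	return n
}

func main() {
	table := NewStatusTable()
	var wg sync.WaitGroup
	for id := 0; id < 100; id++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			table.Set(id, "done")
		}(id)
	}
	wg.Wait()
	fmt.Println(table.Count("done")) // prints 100
}
```

Without the mutex, concurrent writes to the map would be a data race. Running the program or its tests with the `-race` flag (for example `go run -race .`) is a common way to catch such mistakes.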

3. **Enhance Error Handling**: Use Go's built-in error handling to create a more reliable and fault-tolerant scheduler.

4. **Integrate with Existing Systems**: Ensure the new Go-based scheduler integrates seamlessly with the current FOSSology ecosystem.

5. **Test and Deploy**: Thoroughly test the new system for performance and reliability before full deployment.

6. **Document the System**: Provide comprehensive documentation to support future development and use of the scheduler.
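For the error-handling step in the plan above, a minimal sketch of Go's idiomatic style might look like the following. It wraps a sentinel error with `%w` and inspects it with `errors.Is`. The names `ErrAgentUnavailable`, `runJob`, and `shouldRetry`, and the retry policy itself, are assumptions made for illustration only.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrAgentUnavailable is a hypothetical sentinel error for a transient failure.
var ErrAgentUnavailable = errors.New("agent unavailable")

// runJob simulates running a job; it fails when the agent is not reachable.
// The error is wrapped with %w so callers can still identify its cause.
func runJob(id int, agentUp bool) error {
	if !agentUp {
		return fmt.Errorf("job %d: %w", id, ErrAgentUnavailable)
	}
	return nil
}

// shouldRetry reports whether the failure is one assumed to be worth retrying.
func shouldRetry(err error) bool {
	return errors.Is(err, ErrAgentUnavailable)
}

func main() {
	if err := runJob(7, false); err != nil {
		fmt.Println("error:", err)              // prints "error: job 7: agent unavailable"
		fmt.Println("retry:", shouldRetry(err)) // prints "retry: true"
	}
}
```

Wrapping keeps the job context in the message while letting the scheduler decide per failure type whether to retry or fail the job, so a single failing job need not affect the rest of the system.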