Rohan Rajesh Kalbag

OptiVLSI - A Python library for fast, optimized VLSI CAD algorithms

Click for more details about project

In the realm of Very Large Scale Integration (VLSI), where digital circuits comprise billions of transistors, the demand for computerized design automation, design verification, and testing algorithms is paramount. Digital circuits are typically represented as graphs, where logic gates serve as nodes and their interconnections form the edges. Given the complexity of VLSI circuits, often involving millions of logic gates, there is a pressing need for fast, highly optimized graph algorithms. While optimized graph libraries such as networkx exist, there is a noticeable gap in the availability of open-source libraries tailored specifically to the VLSI computer-aided design (CAD) industry. In this project, a range of optimized algorithms and implementations was developed, including the Lee Algorithm, Kruskal’s Algorithm, Binary Decision Diagrams, the Bellman-Ford Algorithm, Prim’s Algorithm, Dijkstra’s Algorithm, a Compiled Code Simulator, and an Event Driven Simulator. These implementations were thoroughly documented for transparency and accessibility. To enhance performance, tools such as Numba were employed to accelerate the algorithms, with systematic comparisons made against Pythonic and other conventional implementations. Furthermore, Automan was used to streamline simulations, benchmark the algorithms, and evaluate results across a diverse range of circuits and graphs of varying size and complexity. This project reflects a dedicated commitment to advancing VLSI design by offering optimized algorithms and leveraging modern tools to meet the specific demands of the VLSI CAD industry.
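As an illustration of the kind of graph routine the library optimizes, here is a minimal pure-Python sketch of Dijkstra’s Algorithm over a gate-level interconnect graph. The function name and the tiny graph `g` are illustrative, not OptiVLSI’s actual API; in the library, hot loops like this are additionally accelerated with Numba.

```python
import heapq

def dijkstra(adj, src):
    """Shortest-path distances from src over a weighted adjacency list.

    adj: {node: [(neighbor, weight), ...]} - gates as nodes, wires as edges.
    """
    dist = {src: 0}
    pq = [(0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry, a shorter path was already found
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return dist

# Hypothetical four-gate interconnect graph
g = {"A": [("B", 1), ("C", 4)], "B": [("C", 2), ("D", 5)], "C": [("D", 1)], "D": []}
print(dijkstra(g, "A"))  # {'A': 0, 'B': 1, 'C': 3, 'D': 4}
```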

Accelerated Implementation of Kohonen Self Organising Map for Remote Sensing of Satellite Images

Click to access source files

This project showcases an optimized Numba-JIT accelerated Pythonic implementation of the Kohonen Self-Organizing Map (SOM) with customizable grid matrix sizes, designed for multispectral satellite image processing. The system takes multispectral satellite images as input and generates coded images using the trained SOM as a codebook, all conveniently packaged as a Python executable. Furthermore, the project includes image restoration capabilities, enabling a comparison with the original image, and provides vivid data visualization through informative plots. The implementation allows users to set parameters such as the SOM dimensions, initial learning rate, maximum iterations, and neighborhood function spread factor for fine-tuned control during execution.
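A minimal sketch of the SOM training loop described above, in pure Python for clarity. The function names, parameter defaults, and the two-cluster demo are illustrative, not the project’s actual interface; the real implementation is Numba-JIT accelerated and operates on multispectral pixel vectors.

```python
import math
import random

def train_som(data, rows, cols, dim, iters=2000, lr0=0.5, sigma0=1.5, seed=0):
    """Kohonen SOM: a grid of weight vectors adapted toward input samples."""
    rng = random.Random(seed)
    grid = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
            for _ in range(rows)]
    for t in range(iters):
        x = data[rng.randrange(len(data))]
        # learning rate and neighbourhood spread both decay over time
        lr = lr0 * math.exp(-t / iters)
        sigma = max(sigma0 * math.exp(-t / (iters / 4)), 0.2)
        # best-matching unit (BMU): the grid cell closest to the sample
        br, bc = min(((r, c) for r in range(rows) for c in range(cols)),
                     key=lambda rc: sum((grid[rc[0]][rc[1]][k] - x[k]) ** 2
                                        for k in range(dim)))
        # pull the BMU and (to a lesser degree) its neighbours toward x
        for r in range(rows):
            for c in range(cols):
                h = math.exp(-((r - br) ** 2 + (c - bc) ** 2) / (2 * sigma ** 2))
                w = grid[r][c]
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
    return grid

def quantize(grid, x):
    """Code a pixel vector as the index of its best-matching codebook entry."""
    rows, cols = len(grid), len(grid[0])
    return min(((r, c) for r in range(rows) for c in range(cols)),
               key=lambda rc: sum((grid[rc[0]][rc[1]][k] - x[k]) ** 2
                                  for k in range(len(x))))

# Two artificial pixel clusters map to distinct codebook entries after training
grid = train_som([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]], rows=2, cols=2, dim=3)
b0 = quantize(grid, [0.0, 0.0, 0.0])
b1 = quantize(grid, [1.0, 1.0, 1.0])
```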

IoT-based Biogas Plant Health Monitoring

Click for more details about project

An IoT-based monitoring system for a biogas plant that measures and displays parameters such as gas concentrations, humidity, and temperature. The design consists of a custom, in-house designed two-layer printed circuit board that interfaces gas sensors in a modular fashion over I2C. The data is sent to a central server and stored in a cloud database for remote access and analysis through a web application. The system helps optimize biogas production, identify issues, improve safety, and offers the flexibility to integrate additional gas sensors. Furthermore, we successfully addressed anticipated problems during the development and prototyping stages and submitted a complete, market-ready product.

This project was recognised with the Best Project Award for the Electronics Design Lab at IIT Bombay.

IITB-RISC-2022

The IITB-RISC-22 is a 16-bit computer system with a Turing-complete ISA capable of executing 17 instructions. It features 8 general-purpose registers (R0 to R7), with R7 doubling as the program counter, along with a carry flag and a zero flag, two 16-bit Arithmetic Logic Units (ALUs), and a 16-bit priority encoder producing a 16-bit output and a 3-bit register address. The design includes two sign extenders, SE6 and SE9, which extend 6-bit and 9-bit inputs respectively to 16-bit outputs, and two left shifters, Lshifter7 and Lshifter1, which shift left by 7 and 1 bit(s) respectively, appending zeros on the right to yield 16-bit outputs. Additionally, the architecture includes four temporary registers, TA, TB, TC, and TD, where TA, TB, and TC are 16-bit and TD is 3-bit. This design is complemented by a 128-byte (64-word addressable) random access memory.
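The behaviour of the sign extenders and left shifters is easy to pin down with a small Python golden model (the function names are illustrative; the actual components are hardware blocks in the processor’s datapath):

```python
def sign_extend(value, in_bits, out_bits=16):
    """Sign-extend an in_bits-wide two's-complement value to out_bits (SE6/SE9 behaviour)."""
    sign = 1 << (in_bits - 1)
    # subtract the weight of the sign bit to recover the negative value, then mask
    return ((value & (sign - 1)) - (value & sign)) & ((1 << out_bits) - 1)

def lshift(value, amount, out_bits=16):
    """Left shift appending zeros on the right (Lshifter7/Lshifter1 behaviour)."""
    return (value << amount) & ((1 << out_bits) - 1)

print(hex(sign_extend(0b111111, 6)))    # 0xffff  (-1 in 6 bits stays -1 in 16)
print(hex(sign_extend(0b100000000, 9))) # 0xff00  (-256 in 9 bits)
print(hex(lshift(0x1FF, 7)))            # 0xff80
```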

In a significant development effort, a six-stage pipelined version of the processor was crafted. The entire instruction set was rigorously tested by loading test programs into memory, and waveforms were verified on the Xilinx Spartan 6 FPGA. To elevate processor performance, the design was expanded into a two-way-fetch out-of-order superscalar architecture, boosting instructions per cycle (IPC). This enhancement introduced a re-order buffer, a reservation station, and Tomasulo’s register renaming algorithm, thereby streamlining the processor’s execution and optimizing its overall efficiency.

Implementation of Stop-and-Wait Algorithm

Click for more details about project

This project involved the development of a UDP-based Stop-and-Wait algorithm for reliable data transfer. The sender created UDP sockets for communication with the receiver and managed packet transmission and retransmission based on acknowledgments from the receiver, ensuring data integrity. The code featured a timeout mechanism built on the select() function for precise timeout management, enabling the sender to monitor responses and handle packet loss effectively. On the receiver side, the system continuously listened for incoming packets and introduced a probabilistic packet drop mechanism, where packets were randomly dropped based on a user-defined probability, enabling examination of the protocol’s robustness in the face of data loss. The receiver tracked the sequence numbers of incoming packets, allowing it to acknowledge correctly received packets and request retransmissions when necessary. This project explored the intricacies of network communication, demonstrating a reliable communication protocol with advanced timeout handling and packet loss simulation.
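The sender’s core loop can be sketched as follows; this is a simplified illustration, not the project’s exact code, and the packet format, helper name, and single-ACK loopback demo are assumptions:

```python
import select
import socket
import struct
import threading

def send_reliable(sock, dest, seq, payload, timeout=0.5, max_tries=5):
    """Stop-and-Wait sender: transmit [seq | payload], wait for a matching ACK,
    retransmit on timeout. Returns True once the correct ACK arrives."""
    packet = struct.pack("!H", seq) + payload
    for _ in range(max_tries):
        sock.sendto(packet, dest)
        # select() gives a precise per-attempt timeout on the ACK wait
        ready, _, _ = select.select([sock], [], [], timeout)
        if not ready:
            continue  # timed out: fall through and retransmit
        ack, _ = sock.recvfrom(16)
        if len(ack) >= 2 and struct.unpack("!H", ack[:2])[0] == seq:
            return True  # ACK carries the expected sequence number
    return False

# Loopback demo: a receiver thread ACKs one packet by echoing its sequence number
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))

def ack_once():
    data, addr = receiver.recvfrom(1024)
    receiver.sendto(data[:2], addr)  # ACK = sequence number of the received packet

threading.Thread(target=ack_once, daemon=True).start()
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
delivered = send_reliable(sender, receiver.getsockname(), 7, b"hello")
print(delivered)  # True
sender.close()
receiver.close()
```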

Implementation, Visualisation and Analysis of Circuit Partitioning Algorithms

Click for more details about project

Recognizing that VLSI design schematics often exceed the capacity of a single FPGA due to the finite number of programmable logic elements, this project adopts a graph-based approach to represent interconnections between circuit elements. Logic gates, LUTs, FFs, and other design entities are modeled as graph nodes, while interconnections are depicted as edges; in cases involving multiple parallel interconnects, weighted graphs are employed for precision. This method facilitates the automated division of a design among multiple FPGAs using CAD, effectively transforming the challenge into a graph partitioning problem. The project implements three significant partitioning algorithms in Python – Kernighan-Lin, Clustering-Based, and Hagen-Kahng-EIG – identifying metrics to assess their performance and prioritizing the minimization of inter-FPGA interconnects. Leveraging co-optimization across these metrics, advanced cost functions were developed, demonstrating the project’s commitment to achieving optimal circuit partitioning solutions within the realm of VLSI CAD.
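The underlying objective is easy to state in code: count the edges that cross the partition, and evaluate Kernighan-Lin style swap gains against that count. This is a minimal sketch with a hypothetical netlist; the project’s implementations and cost functions are more elaborate.

```python
def cut_size(edges, part):
    """Edges crossing the partition: the inter-FPGA interconnects to minimise."""
    return sum(1 for u, v in edges if part[u] != part[v])

def swap_gain(edges, part, a, b):
    """Kernighan-Lin style gain: cut-size reduction from exchanging nodes a and b."""
    before = cut_size(edges, part)
    part[a], part[b] = part[b], part[a]  # trial swap
    after = cut_size(edges, part)
    part[a], part[b] = part[b], part[a]  # undo the trial
    return before - after

# Hypothetical netlist: two tightly coupled gate clusters joined by one wire
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"), ("c", "d")]
part = {"a": 0, "b": 0, "c": 1, "d": 1, "e": 0, "f": 1}  # a poor initial split
print(cut_size(edges, part))             # 4
print(swap_gain(edges, part, "c", "e"))  # 3: swapping c and e removes 3 crossings
```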

Optimised Multiply Accumulate Circuit using Dadda Multiplier Architecture

Click for more details about project

This project involves an optimized Multiply Accumulate Circuit, implemented in VHDL using a Dadda Multiplier Architecture and a 16-bit Brent-Kung Adder. It multiplies two 8-bit operands and adds a 16-bit number to the product. The hardware descriptions were tested and simulated using GHDL, producing comprehensive test reports. The submission includes all necessary files, such as test scripts and waveform analysis tools. Detailed result analysis and RTL synthesis for the FPGA were performed using Intel Quartus.
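A Python golden model helps make the datapath concrete: the Dadda tree reduces the 8×8 matrix of partial-product bits, and the Brent-Kung adder performs the final carry-propagate addition; behaviourally, both steps collapse to the arithmetic below. The function names are illustrative, of the kind used to cross-check a VHDL implementation rather than part of it.

```python
def partial_products(a, b, n=8):
    """The n*n AND-gate matrix a Dadda tree reduces; bit (i, j) has weight 2**(i+j)."""
    return [[((a >> i) & 1) & ((b >> j) & 1) for j in range(n)] for i in range(n)]

def mac(a, b, c, width=16):
    """Golden behavioural model of the MAC: (a * b + c) truncated to `width` bits."""
    pp = partial_products(a, b)
    product = sum(bit << (i + j) for i, row in enumerate(pp)
                  for j, bit in enumerate(row))
    return (product + c) & ((1 << width) - 1)

print(mac(0xAB, 0xCD, 0x1234))  # 39715 == (0xAB * 0xCD + 0x1234) & 0xFFFF
```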

Asymptotic Analysis of Autoencoder Architectures for Image Colorization and Noise Reduction

Click for more details about project

In this study, deep learning was leveraged to address image colorization and noise reduction challenges. The centerpiece of the project was a Convolutional Neural Network (CNN)-based autoencoder, trained on grayscale CIFAR-10 images, which achieved a Root Mean Square Error (RMSE) of 0.052 for generating colorized versions. Further investigations comprised a comparative analysis of autoencoders and Principal Component Analysis (PCA) for Gaussian and salt-and-pepper noise reduction, using training data from MNIST images. This analysis provided valuable insights into the strengths and limitations of the two methodologies, shedding light on their suitability for various noise reduction scenarios. Additionally, the project explored the data specificity of autoencoders by executing the same model on different image classes, illustrating how autoencoders adapt to diverse data sets.
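To make the PCA baseline concrete, here is a minimal rank-1 PCA reconstruction in two dimensions: projecting noisy points onto their first principal component removes all variance orthogonal to it, which is the same mechanism that the PCA image-denoising baseline applies with more components to MNIST pixel vectors. This is a pure-Python illustration; the function name and sample data are assumptions.

```python
import math

def pca_denoise_2d(points):
    """Project 2-D points onto their first principal component
    (a rank-1 PCA reconstruction around the sample mean)."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # entries of the 2x2 covariance matrix
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    # leading eigenvector of [[sxx, sxy], [sxy, syy]] in closed form
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    ux, uy = math.cos(theta), math.sin(theta)
    out = []
    for x, y in points:
        t = (x - mx) * ux + (y - my) * uy  # coordinate along the component
        out.append((mx + t * ux, my + t * uy))
    return out

# Noisy samples of the line y = x; the output lies exactly on the fitted axis
pts = [(0, 0.1), (1, 0.9), (2, 2.1), (3, 2.9)]
smoothed = pca_denoise_2d(pts)
```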