Hiii, my name is
Mingyong Ma.
young man with love and passion
I am a student who intends to become a software developer.
1.
About Me
2.
Where I've Worked
Software Engineer Intern @Adobe
June 19 - September 8, 2023
- Built infrastructure for Adobe's primary AI platform: converted API calls from synchronous to asynchronous and reduced network I/O from 1 GB to 9 MB per inference call. Load-tested with JMeter, generating 1600 TPS (tokens per second).
- Implemented a REST API supporting CRUD operations on a specific task, persisted in a PostgreSQL database with Alembic version control.
- Generated a Docker container for each fine-tuning task, scheduled according to the load on Azure cloud.
- Facilitated the fine-tuning of Flan-T5 and Llama 2 models, with the base model and the fine-tuning layers stored independently using PEFT.
Software Engineer Intern @Amazon
July - August 2022
- Developed an image processing system that combines deep learning techniques with the Unsharp algorithm, achieving lower sharpness results compared to the camera algorithm used in tablets, and evaluated the system's performance using MTF-50.
- Compared the performance of algorithms by controlling the imaging device with adb and generating identical images with different image sharpening settings in the Amazon lab.
- Developed a user-friendly model for controlling sharpness manually or automatically.
- Tested classical image processing methods, including the Sobel operator, the Canny operator, and the Unsharp algorithm, using Imatest software in the Amazon lab.
- Link to my doc
Data Engineer Intern @Lenovo
November 2021 - February 2022
- Conducted time series forecasting to predict future sales of Lenovo's notebook products and tablets, utilizing Lenovo's historical sales data as well as data from other companies such as IDC and GFK.
- Increased the forecasting accuracy of the model by 1.2% by implementing machine learning algorithms such as Prophet and deep learning models such as LSTM and GRU.
- Used Optuna to tune the hyperparameters of the project's existing code, which saved a large amount of time compared with the traditional grid search method.
3.
Some Things I've Built
Distributed Cloud File Management System
- Implemented a simplified version of the HTTP protocol:
  - Clients send request messages to the server, and the server replies with response messages, all layered on top of TCP.
  - Implemented HTTP persistent connections: a client can reuse a TCP connection to a given server.
  - Provided access control so that clients cannot reach anything outside the document root.
- Created a fault-tolerant, cloud-based file storage service called SurfStore (client and server communicate using gRPC):
  - The SurfStore service is composed of the following two services:
    - BlockStore: stores blocks and, when given an identifier, retrieves and returns the corresponding block.
    - MetaStore: manages file metadata and the mapping of filenames to blocks (block hashes computed with SHA-256).
  - Each client stores its file data in a local database with version numbers. When the client is invoked, a sync operation runs: new files added to the base directory are uploaded to the cloud, files that other clients synced to the cloud are downloaded to the base directory, and any files with "edit conflicts" are resolved.
  - Blocks are stored and managed across different BlockStore servers using a consistent hashing ring.
  - The MetaStore is fault tolerant and stays consistent as long as only a minority of servers fail, using the Raft protocol. (Raft Simulator)
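The consistent hashing ring mentioned above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the class name, the server address strings, and the rule "first server clockwise from the block hash" are all assumptions made for the example.

```python
import bisect
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a lowercase hex string."""
    return hashlib.sha256(data).hexdigest()


class ConsistentHashRing:
    """Hypothetical ring that maps block hashes to BlockStore addresses."""

    def __init__(self, server_addrs):
        if not server_addrs:
            raise ValueError("the ring needs at least one server")
        # Each server sits on the ring at the hash of its address
        # (the address format is an assumption for this sketch).
        self._ring = sorted((sha256_hex(a.encode()), a) for a in server_addrs)
        self._positions = [pos for pos, _ in self._ring]

    def responsible_server(self, block_hash: str) -> str:
        # Pick the first server whose position is greater than the block
        # hash, wrapping around to the start of the ring if none is.
        idx = bisect.bisect_right(self._positions, block_hash)
        return self._ring[idx % len(self._ring)][1]


servers = ["blockstore1:8081", "blockstore2:8082", "blockstore3:8083"]
ring = ConsistentHashRing(servers)
owner = ring.responsible_server(sha256_hex(b"some block contents"))
```

A useful property of this layout is that removing a server other than the owner leaves the block's placement unchanged, so only the blocks of the removed server need to move.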
B+ Tree with Buffer Management
- Built a buffer pool on top of the I/O layer with page frames and a hash table that maps frame numbers to page numbers.
- Each page frame has a PinCnt (whether the page is in use), a dirty bit (whether the page has been modified), and a reference bit (used by the LRU clock algorithm).
- Implemented the buffer replacement policy with the LRU clock algorithm.
- Built a B+ tree on top of the buffer pool layer, supporting INSERT and DELETE operations.
- Initialized the B+ tree with bulk loading, splitting a page when its number of entries exceeds the fanout.
- The B+ tree supports inserting int, double, and char* values.
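The clock replacement policy described above can be illustrated with a small sketch. It is written in Python for brevity and is not the project's code: the class and method names are hypothetical, the lookup table here is keyed by page number (an assumption), and writing dirty pages back to disk is left out.

```python
class Frame:
    """One buffer frame with the bookkeeping fields named above."""

    def __init__(self):
        self.page_no = None   # page currently held, or None if the frame is free
        self.pin_cnt = 0      # number of users currently holding the page
        self.dirty = False    # set when the page was modified in memory
        self.ref_bit = False  # "recently used" flag consulted by the clock hand


class ClockBufferPool:
    def __init__(self, num_frames):
        self.frames = [Frame() for _ in range(num_frames)]
        self.page_table = {}  # page number -> frame index (assumed direction)
        self.hand = 0

    def _pick_victim(self):
        # Two full sweeps suffice: the first may only clear reference bits.
        for _ in range(2 * len(self.frames)):
            idx = self.hand
            frame = self.frames[idx]
            self.hand = (self.hand + 1) % len(self.frames)
            if frame.pin_cnt > 0:
                continue               # pinned frames are never evicted
            if frame.ref_bit:
                frame.ref_bit = False  # give the page a second chance
                continue
            return idx
        raise RuntimeError("all frames are pinned")

    def pin(self, page_no):
        if page_no in self.page_table:
            idx = self.page_table[page_no]
        else:
            idx = self._pick_victim()
            frame = self.frames[idx]
            if frame.page_no is not None:
                # A real pool would write the old page back here if dirty.
                del self.page_table[frame.page_no]
            frame.page_no = page_no
            frame.dirty = False
            self.page_table[page_no] = idx
        frame = self.frames[idx]
        frame.pin_cnt += 1
        frame.ref_bit = True
        return idx

    def unpin(self, page_no, dirty=False):
        frame = self.frames[self.page_table[page_no]]
        if frame.pin_cnt == 0:
            raise ValueError("page is not pinned")
        frame.pin_cnt -= 1
        frame.dirty = frame.dirty or dirty
```

With two frames, pinning pages 1 and 2 fills the pool; a third pin fails until one page is unpinned, after which the clock hand clears that page's reference bit on its first pass and evicts it on the second.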
Operating System Implementation
- Implemented internal structures of the operating system: an Alarm() function built on the timer interrupt; a Join() function that puts the parent to sleep while it waits for the child thread to finish; and semaphores to provide atomicity.
- Created a pageTable data structure for each user process, which maps the process's virtual addresses to physical addresses.
- Implemented the system calls create, open, read, write, close, unlink, join, exit, and exec.
- Implemented demand paging, with page replacement freeing up a physical page frame when handling page faults.
Active Learning for data limited classification task
- Applied active learning to the classification of malaria cells.
- Used only 26% of the data to reach accuracy comparable to results in other literature that use 100% of the data, by applying uncertainty sampling.
- Explored which choices improve model generalization the most by comparing random sampling with uncertainty sampling, and Logistic Regression with SVM.
- Paper accepted by the 2021 IEEE 3rd International Conference on Frontiers Technology of Information and Computer.
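Uncertainty sampling, as used above, can be sketched with the least-confidence criterion: the samples whose top predicted class probability is lowest are sent for labeling first. The function name and the example probabilities are hypothetical, and the project may have used a different uncertainty measure.

```python
def least_confident_indices(probabilities, batch_size):
    """Return indices of the batch_size most uncertain unlabeled samples.

    probabilities holds one list of class probabilities per sample, as a
    classifier such as Logistic Regression or an SVM with probability
    outputs might produce (assumed input format).
    """
    # Least-confidence score: 1 minus the probability of the predicted class.
    uncertainty = [1.0 - max(p) for p in probabilities]
    ranked = sorted(range(len(probabilities)),
                    key=lambda i: uncertainty[i],
                    reverse=True)
    return ranked[:batch_size]


# Made-up predictions for four unlabeled cell images (two classes each).
pool_probs = [[0.9, 0.1], [0.55, 0.45], [0.2, 0.8], [0.5, 0.5]]
to_label = least_confident_indices(pool_probs, batch_size=2)
```

In an active learning loop, the selected samples would be labeled, added to the training set, and the model retrained before the next selection round; random sampling replaces the ranking step with a random draw.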
DS Analysis Using Spark
- Applied Spark to the analysis of TAVG measurements in India.
- Used Spark DataFrames and interactive plotting.
- Link to my Jupyter notebook