SAWERA KHADIUM
Artificial Intelligence Engineer
I'm a passionate and dedicated AI engineer with 5+ years of experience in AI. I can take a project from scratch to completion, from 0 to 100. I have led high-end projects across Artificial Intelligence, including NLP, Computer Vision, Machine Learning, Data Analytics, Web Scraping, Generative AI, and Backend.
FUN AI PROJECTS
Image to Video
Experimenting with Runway Gen-3 Alpha!
In this experiment I used the Runway Gen-3 Alpha Turbo model and tested its image-to-video capabilities by creating something fun.
Here's a glimpse of what I did:
- Collected Dr. Strange illustrations
- Wrote a simple 3-4 line script with ChatGPT
- Turned the lines into audio using ElevenLabs
- Animated each image with Runway Gen-3 Alpha Turbo
- Tried out Lip Sync on one character (mind-blowing results!)
- Added background music from Pixabay
- Combined everything using Canva
Check out the final result!
PuLID Model
Identity-Focused Image Generation Without Fine-Tuning
I recently came across PuLID, a powerful open-source model for identity-based image generation, and tested it on my own images. The results turned out really good.
About this model
PuLID is a tuning-free ID customization approach. PuLID maintains high ID fidelity while effectively reducing interference with the original model's behavior.
A single ID image is usually sufficient, though you can also supplement it with additional auxiliary images.
Add your favourite prompt along with the images, and you get your own identity-based images within seconds.
Real-ESRGAN Model
Utilize Real-ESRGAN to Upscale Video Quality
In this experiment I used YouTube's very first video, in 240p, as input.
Choose a GAN model; for this experiment, Real-ESRGAN.
Step 1: Splitting Up the Video
First, we break the video into image frames.
Step 2: Processing Each Image Frame with GANs
Use the GAN to work some magic on each frame and upscale it.
Step 3: Saving the Improved Images
After each frame is upscaled, save it in the original video sequence order.
Step 4: Putting It All Back Together
Finally, put all the upscaled frames back together to create a super-duper enhanced video!
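The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used in the experiment: frames are represented as small nested lists instead of decoded video frames, and `upscale_nearest` is a hypothetical nearest-neighbour placeholder standing in for a real GAN upscaler such as Real-ESRGAN. Decoding the video and encoding the result are not shown.

```python
from typing import Callable, Iterable, List

# Assumption: a frame is a list of rows of grayscale pixel values.
Frame = List[List[int]]


def upscale_nearest(frame: Frame, scale: int = 2) -> Frame:
    """Placeholder upscaler (nearest neighbour); a GAN model would go here."""
    out: Frame = []
    for row in frame:
        # Repeat every pixel `scale` times horizontally.
        wide = [px for px in row for _ in range(scale)]
        # Repeat the widened row `scale` times vertically.
        out.extend(list(wide) for _ in range(scale))
    return out


def upscale_video(
    frames: Iterable[Frame],
    upscale: Callable[[Frame], Frame] = upscale_nearest,
) -> List[Frame]:
    """Steps 2-4: upscale each frame and keep the original sequence order."""
    return [upscale(frame) for frame in frames]


# Step 1 stand-in: two tiny 2x2 "frames" instead of frames read from a file.
frames = [[[0, 1], [2, 3]], [[4, 5], [6, 7]]]
result = upscale_video(frames)
print(len(result), len(result[0]), len(result[0][0]))
```

Passing the upscaler as a function keeps the frame loop independent of the model, so the placeholder can be swapped for real model inference without changing the ordering logic.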
Text to Image And Image to Image Models
Prompted multiple text-to-image and image-to-image models
I implemented a Stable Diffusion model for image generation. Along the way, I discovered the importance of prompt engineering and thorough testing, not only for experimenting with image generation models but also for their fine-tuned variations. Crafting precise prompts and meticulously analyzing the output allowed me to push the boundaries and achieve stunning visual results.
I have experimented with almost all image generation models on the market, especially Stable Diffusion models and ControlNets, as well as DALL-E, Midjourney, Openjourney, and LoRA-based variants.
Apart from this, I have experience fine-tuning these models as well.
PIFuHD Model
2D to 3D Realistic Human Mesh Generation
Used the pretrained PIFuHD model to generate 3D reconstructions of humans from their 2D images with state-of-the-art quality and detail. The approach can be applied to a variety of poses, viewpoints, scales, and severe occlusions. It can be used in gaming and metaverse applications for 3D character generation and real-time human representation.
Large Language Models (LLMs)
Fine-Tuning the GPT-2 LLM on Text Data
This project fine-tunes a GPT-2 model using Hugging Face Transformers, enabling custom text generation. Users learn to load the model, train it, and generate contextually relevant responses, gaining experience in applying advanced NLP models to various applications.
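The project's training code is not shown here. As a minimal sketch of one preprocessing step that is commonly done before fine-tuning a causal language model, the example below packs token ids into fixed-length training blocks. It rests on assumptions: the whitespace tokenizer is a stand-in for GPT-2's byte-pair tokenizer, and the names `build_vocab` and `pack_blocks` are hypothetical, not part of Hugging Face Transformers.

```python
from typing import Dict, List


def build_vocab(text: str) -> Dict[str, int]:
    """Map each distinct whitespace-separated word to an integer id."""
    vocab: Dict[str, int] = {}
    for word in text.split():
        vocab.setdefault(word, len(vocab))
    return vocab


def pack_blocks(token_ids: List[int], block_size: int) -> List[List[int]]:
    """Split ids into full blocks of block_size; the incomplete tail is dropped."""
    usable = (len(token_ids) // block_size) * block_size
    return [token_ids[i:i + block_size] for i in range(0, usable, block_size)]


text = "the cat sat on the mat and the dog sat too"
vocab = build_vocab(text)
ids = [vocab[word] for word in text.split()]
blocks = pack_blocks(ids, block_size=4)
print(blocks)
```

Dropping the incomplete tail keeps every training example the same length, at the cost of discarding a few tokens at the end of the corpus.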
Stable Diffusion
AI Magic Avatars
Fine-tuned a Stable Diffusion model on my own images and generated Lensa-like AI Magic Avatars.
Generative Adversarial Networks (GANs)
HD Image Quality Upscale
Used a GAN model to upscale low-quality images to HD quality by 90% and increase resolution, specifically for faces.
My Career Journey
I am a passionate and experienced AI Engineer with over 5 years of expertise in Python. I have a proven track record of developing and implementing innovative AI solutions, with a strong focus on Natural Language Processing (NLP), Computer Vision, Machine Learning, Data Science, Data Analytics, Backend Development, and Generative AI.
I keep myself up to date with the latest AI tools and advancements. I specialize in delivering results quickly while improving user experiences, refining business solutions, and leading the way in AI innovation.
Throughout my career, I have successfully implemented numerous cutting-edge AI solutions using various AI technologies:
NLP projects such as personalized recommendation systems, chatbots, conversational AI, and named-entity recognition models, leveraging techniques and models like prompt engineering, GPT-2, BERT, T5, GPT-3, ChatGPT, GPT-4, Auto-GPT, LLMs, RAG, LLaVA, LangChain, and Llama 2.
Computer vision applications encompassing image classification, image processing, deep learning, object detection, and Wav2Lip with GANs on videos.
Machine learning models utilizing big data, deep learning, neural networks, and architectures and libraries like VGG16, VGG19, CNN, LSTM, RNN, and fastai.
Data analytics solutions for hotels and stock price prediction, EBITDA calculation, graph visualizations, and ETL pipelines.
Web scraping projects targeting various websites to automate data collection.
Insights and metrics generation using tools like Retool, Airtable, Zapier, Power BI, and Google BigQuery.
Backend development using Django and Flask to power AI-driven APIs in web and mobile applications.
Generative AI applications, including fine-tuned Stable Diffusion models for high-quality image generation, AI avatar creation, text-to-video and image-to-video models, and lip-sync on video with speech.
3D character creation leveraging tools like CC4, iClone, MetaHuman, Unreal, LiveLink, Avaturn, Audio2Face, and NVIDIA Omniverse.
Apprenticeship
Image Forensics Tool PhotoChamp
Image forensic analysis tool named PhotoChamp.
Python-based desktop application with a user-friendly PyQt GUI.
Machine learning models trained on big data (4 GB to 12 GB) to detect image manipulation, copy-move forgery, and image splicing.
Machine-learning object detection: car detection in images using CNN and DNN neural networks.
OpenCV-based find-and-compare module to highlight the differences between images.
Module to download subtitles for videos.
Module to export the history of all browsers to a CSV file.
PDF viewer to view PDF files in the forensic tool GUI.
Media player to play multiple audio and video files in the forensic tool GUI.
AI Engineer
Personalized Recommender System
Built a personalized Natural Language Processing (NLP) recommendation system for a SaaS app, including an offline model for recommendations in the mobile app. Built 3-4 microservices integrated with MongoDB and an AWS S3 bucket.
COVID-19 Self-Testing Kit App
Built microservices for the XanaMediTest COVID-19 self-testing kit app. Achieved 99.72% accuracy with fastai in classifying the testing kit result as positive, negative, or invalid. Built one microservice to read the QR code from the kit and match it against the previously entered kit information, and another to generate a PDF report of the user's COVID-19 results.
Notification Bot
Automated bot that sends an email notification to the user as soon as a new crypto token or coin is listed on gate.io, without sending repeat notifications for the same listing.
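The bot's implementation is not shown here. The no-repeat behaviour can be sketched as follows, assuming the bot periodically receives the current list of listed symbols. The class name `ListingNotifier` and the injected `notify` callable are hypothetical; fetching listings from gate.io and sending the email are left out.

```python
from typing import Callable, Iterable, List, Set


class ListingNotifier:
    """Sends one notification per newly seen listing symbol."""

    def __init__(self, notify: Callable[[str], None]) -> None:
        self._notify = notify        # e.g. a function that sends an email
        self._seen: Set[str] = set()  # symbols already announced

    def process(self, symbols: Iterable[str]) -> List[str]:
        """Notify for symbols not seen before and return them in order."""
        fresh: List[str] = []
        for symbol in symbols:
            if symbol not in self._seen:
                self._seen.add(symbol)
                self._notify(symbol)
                fresh.append(symbol)
        return fresh


# Usage: record "sent" notifications in a list instead of emailing.
sent: List[str] = []
bot = ListingNotifier(notify=sent.append)
first = bot.process(["BTC", "ETH"])
second = bot.process(["BTC", "ETH", "NEWCOIN"])
print(first, second, sent)
```

In this sketch the first poll announces every symbol it sees. A long-running bot would likely seed the seen set on startup and persist it between runs, which is omitted here.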
Live Camera Background Removal / Change
Bokeh effect on image and video backgrounds; live camera background removal, replacement, or blur.
Desktop application
Desktop application for ecommerce seen store for building and generating CSV files of image data, etc.
Web Scraping
Scraped websites and converted the results into useful data using Selenium, bs4 (Beautiful Soup), and APIs.
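As a small illustration of turning page markup into structured data, the sketch below extracts link targets and link text from an HTML snippet. It makes assumptions: it uses Python's standard-library `html.parser` as a stand-in for bs4 so it runs without extra packages, the HTML string is made up, and the class name `LinkExtractor` is hypothetical. Fetching live pages and browser automation with Selenium are not shown.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple


class LinkExtractor(HTMLParser):
    """Collects (href, link text) pairs from anchor tags."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[Tuple[str, str]] = []
        self._href: Optional[str] = None  # href of the anchor being read
        self._text: List[str] = []        # text pieces inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only collect text while inside an anchor that has an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


html = '<ul><li><a href="/a">First item</a></li><li><a href="/b"> Second </a></li></ul>'
parser = LinkExtractor()
parser.feed(html)
print(parser.links)
```

The resulting list of tuples can be written straight to a CSV file or loaded into a data frame for further analysis.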