Gen AI: audio generation and synchronization of imported photo

less than 1 minute read

Generate meaningful audio from an uploaded photo using Hugging Face, LangChain, and OpenAI.

Pre-requisites:

Install the libraries listed in the requirements.txt file:

pip install -r requirements.txt

Design info:

Used Hugging Face to consume ready-made AI models:
image-to-text with the model "Salesforce/blip-image-captioning-base"
text-to-audio with the model "kan-bayashi_ljspeech_vits"
Used LangChain + ChatGPT to generate the text.
Published the image-to-audio app using Streamlit.
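
The design above can be sketched roughly as follows. This is a minimal sketch and not the actual app.py: the function names, the prompt wording, the "espnet/" prefix on the speech model, the hosted Inference API URL, the "gpt-3.5-turbo" model choice, and the HUGGINGFACEHUB_API_TOKEN environment variable are all assumptions made for illustration. Heavy libraries are imported inside each function so the three steps can be swapped out or stubbed independently.

```python
import io

# The captioning model is named in the post. The "espnet/" prefix for the
# speech model is an assumption about where it lives on the Hugging Face Hub.
CAPTION_MODEL = "Salesforce/blip-image-captioning-base"
TTS_MODEL = "espnet/kan-bayashi_ljspeech_vits"


def caption_image(image_bytes: bytes) -> str:
    """Image-to-text using the Hugging Face transformers pipeline."""
    from PIL import Image
    from transformers import pipeline

    captioner = pipeline("image-to-text", model=CAPTION_MODEL)
    image = Image.open(io.BytesIO(image_bytes))
    return captioner(image)[0]["generated_text"]


def write_story(caption: str) -> str:
    """Expand the caption into a short text with LangChain + ChatGPT."""
    from langchain_core.prompts import PromptTemplate
    from langchain_openai import ChatOpenAI  # reads OPENAI_API_KEY from the env

    # Hypothetical prompt; the post does not show the one it uses.
    prompt = PromptTemplate.from_template(
        "Write a short story (under 50 words) about: {caption}"
    )
    chain = prompt | ChatOpenAI(model="gpt-3.5-turbo", temperature=0.7)
    return chain.invoke({"caption": caption}).content


def synthesize_speech(text: str) -> bytes:
    """Text-to-audio through the hosted Inference API (URL is an assumption)."""
    import os
    import requests

    url = f"https://api-inference.huggingface.co/models/{TTS_MODEL}"
    token = os.environ["HUGGINGFACEHUB_API_TOKEN"]  # hypothetical variable name
    response = requests.post(
        url,
        headers={"Authorization": f"Bearer {token}"},
        json={"inputs": text},
        timeout=60,
    )
    response.raise_for_status()
    return response.content  # raw audio bytes


def image_to_audio(image_bytes, caption_fn=caption_image,
                   story_fn=write_story, speech_fn=synthesize_speech):
    """Chain the three steps: photo -> caption -> text -> audio."""
    caption = caption_fn(image_bytes)
    story = story_fn(caption)
    audio = speech_fn(story)
    return caption, story, audio


def main() -> None:
    """Streamlit front end: upload a photo, show the text, play the audio."""
    import streamlit as st

    st.title("Image to Audio")
    uploaded = st.file_uploader("Upload a photo", type=["jpg", "jpeg", "png"])
    if uploaded is not None:
        data = uploaded.getvalue()
        st.image(data)
        caption, story, audio = image_to_audio(data)
        st.write(caption)
        st.write(story)
        st.audio(audio)
```

To use this as a Streamlit script, call main() at the bottom of the file. Passing the steps in as parameters keeps the pipeline easy to try without network access, for example image_to_audio(data, caption_fn=lambda b: "a dog on a beach", story_fn=str.upper, speech_fn=str.encode).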

Build and run:

streamlit run app.py

Image to Audio:

screenshot

