BigAction

community
Activity Feed

AI & ML interests

Large Action Models

dhuynh95 
posted an update 7 months ago
🚀 This weekend I built an MVP of Screenshot to HTML, which uses Gemini Flash to quickly turn screenshots of mocks, competitors, or inspiration into a website!

🤗 Try it on Hugging Face Space for free here: dhuynh95/screenshot_to_html

🧠 You will need to get a Gemini API key, but here is a little-known fact: it’s free! Google has really shipped with Gemini 2.5, and the Flash model can be used for free. Great for experimentation.

In this demo, you can see how we can use AI to turn a screenshot of a website into a fully interactive static HTML page using Gemini.
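For readers who want to try the idea themselves, here is a minimal sketch of what such a screenshot-to-HTML request could look like. It is not the code behind the Space. The prompt wording, the helper names (`build_request`, `send`, `extract_html`), the `gemini-2.5-flash` model name, and the `x-goog-api-key` header are assumptions based on the public Gemini REST API, so check the official docs before relying on them.

```python
import base64
import json
import re
import urllib.request

# Assumption: endpoint shape of the public Gemini REST API (generateContent).
GEMINI_URL = "https://generativelanguage.googleapis.com/v1beta/models/{model}:generateContent"

# Hypothetical prompt, not the one used in the Space.
PROMPT = (
    "Recreate this screenshot as a single self-contained static HTML page. "
    "Return only the HTML."
)

FENCE = "`" * 3  # markdown code fence that models often wrap their answer in


def build_request(image_bytes, model="gemini-2.5-flash", mime_type="image/png"):
    """Return (url, JSON body) for a screenshot-to-HTML request."""
    body = {
        "contents": [{
            "parts": [
                {"text": PROMPT},
                {"inline_data": {
                    "mime_type": mime_type,
                    "data": base64.b64encode(image_bytes).decode("ascii"),
                }},
            ]
        }]
    }
    return GEMINI_URL.format(model=model), json.dumps(body)


def send(url, body, api_key):
    """POST the request and return the model's text (needs network and a key)."""
    request = urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
    )
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    return payload["candidates"][0]["content"]["parts"][0]["text"]


def extract_html(model_text):
    """Strip an optional markdown fence around the model's answer."""
    match = re.search(FENCE + r"(?:html)?\s*(.*?)" + FENCE, model_text, re.DOTALL)
    return (match.group(1) if match else model_text).strip()
```

A typical flow would be to read the screenshot bytes from disk, call `build_request`, pass the result to `send` with your API key, and write the output of `extract_html` to an `.html` file.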

🏴‍☠️ It was fun to build and to get back to weekend hacking. I tried many things for fun, such as using Gemini Flash to locate assets and recreate them, but that was not very successful. I tried other models too, but the fact that Gemini Flash is both smart AND free is a game changer. It’s great for builders!
dhuynh95 
posted an update over 1 year ago
💪Build an information retrieval Agent that can beat Gemini and OpenAI using an open-source Large Action Model framework!

In this video, we ask several proprietary conversational AIs the question:
“What is the most trendy recent paper on Llava models on Hugging Face papers? Provide the date and a summary of the paper”, and the results are interesting!
❌Gemini: found a paper from Jan 29, 2024
❌OpenAI: found a paper from October 2023
❌You.com: found a paper from Jan 29 2024
✅LaVague: found the latest paper (ConvLLaVA, which is dope by the way: https://arxiv.org/abs/2405.15738)!

The best part? Our solution fits in a few lines of code with our open-source framework! I will share how we built that agent during our webinar on AI Web Agents, this Thursday 30th May at 9 am PST (https://lu.ma/m8fzmb3q), so don’t miss it 😉

You can also start playing with our framework: https://github.com/lavague-ai/LaVague
dhuynh95 
updated a Space over 1 year ago
dhuynh95 
posted an update over 1 year ago
🌊LaVague can compile Action Plans into actionable code to browse the internet!

In this example, you can see how an action plan with natural language instructions can be “compiled” into executable Selenium code!
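To make the "compile" idea concrete, here is a toy sketch. It is not how LaVague works: LaVague relies on language models, whereas this stand-in uses a small hypothetical table of regex rules (`RULES`) and a made-up `compile_plan` helper to turn instructions of a few fixed shapes into Selenium code. Only the generated Selenium calls (`driver.get`, `find_element`, `click`, `send_keys`) are real API.

```python
import re

# Hypothetical rule table: each regex recognizes one instruction shape and
# maps it to a line of Selenium code. An LLM would replace this lookup.
RULES = [
    (re.compile(r'^go to (?P<url>\S+)$', re.I),
     'driver.get({url!r})'),
    (re.compile(r'^click on "(?P<label>[^"]+)"$', re.I),
     'driver.find_element(By.LINK_TEXT, {label!r}).click()'),
    (re.compile(r'^type "(?P<text>[^"]+)" in "(?P<name>[^"]+)"$', re.I),
     'driver.find_element(By.NAME, {name!r}).send_keys({text!r})'),
]

# Boilerplate emitted at the top of every generated script.
HEADER = [
    "from selenium import webdriver",
    "from selenium.webdriver.common.by import By",
    "",
    "driver = webdriver.Chrome()",
]


def compile_plan(plan):
    """Turn a list of natural language steps into Selenium source code."""
    lines = list(HEADER)
    for step in plan:
        for pattern, template in RULES:
            match = pattern.match(step.strip())
            if match:
                lines.append(template.format(**match.groupdict()))
                break
        else:
            # No rule matched: fail loudly instead of guessing an action.
            raise ValueError(f"no rule for step: {step!r}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(compile_plan([
        "Go to https://huggingface.co/papers",
        'Click on "Daily Papers"',
    ]))
```

The output is plain Python source, so it can be reviewed before anything runs in a browser, which is the appeal of compiling a plan rather than executing it step by step.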

🤖This shows the potential of #LAM (Large Action Models) to perform actions for us and automate mechanical tasks.
This example leverages a local embedding model and OpenAI GPT-3.5, but we support many options, including local ones with Gemma!
You can try this in our docs: https://docs.lavague.ai/en/latest/

LaVague is an open-source Large Action Model framework to automate automation. If you are interested in helping us on our mission to democratize automation tooling for devs, don’t hesitate to visit our GitHub (https://github.com/lavague-ai/LaVague) or Discord (https://discord.gg/SDxn9KpqX9)!
dhuynh95 
posted an update over 1 year ago
Hello World! This post is written by the Large Action Model framework LaVague! Find out more on https://github.com/mithril-security/LaVague

Edit: Here is the video of 🌊LaVague posting this. This is quite meta!