Beep@lemmus.org to Technology@lemmy.world · English · edited 3 days ago

Google releases Gemma 4 open models

ai.google.dev

Gemma 4 model card  |  Google AI for Developers

Hacker News.

  • brucethemoose@lemmy.world · English · 13 points · edited 2 days ago

    Also, for anyone interested: desktop inference and quantization is my autistic interest. Ask me anything.
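    Since quantization came up: below is a minimal sketch of what symmetric absmax quantization does to a weight vector. The values and the toy 4-bit scheme are illustrative only; real quantizers (e.g. llama.cpp's K-quants) use per-block scales and much cleverer formats.

```python
# Toy symmetric absmax quantization (illustrative, NOT a real
# llama.cpp quant format). One scale for the whole vector; real
# schemes use one scale per small block of weights.
def quantize_absmax(weights, bits=4):
    """Map floats to signed integers in [-qmax, qmax] plus a scale."""
    qmax = 2 ** (bits - 1) - 1          # 7 for 4-bit
    scale = max(abs(w) for w in weights) / qmax
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate floats from integers and the scale."""
    return [x * scale for x in q]

weights = [0.12, -0.7, 0.33, 0.05]     # hypothetical weights
q, scale = quantize_absmax(weights)
restored = dequantize(q, scale)
# Each restored weight is within scale/2 of the original, at a
# fraction of the storage cost of 16-bit floats.
```

    The quality/size trade-off the thread complains about ("bad/lazy quantizations") lives in exactly this rounding step.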

    I don’t like Gemma 4 much so far, but if you want to try it anyway:

    • On Nvidia with no CPU offloading, watch this PR and run it with TabbyAPI: https://github.com/turboderp-org/exllamav3/pull/185

    • With CPU offloading, watch this PR and the mainline llama.cpp issues it links. Once Gemma 4 inference isn’t busted, run it in IK or mainline llama.cpp: https://github.com/ikawrakow/ik_llama.cpp/issues/1572

    • If you’re on an AMD APU, like a Mini PC server, look at: https://github.com/lemonade-sdk/lemonade

    • On an AMD or Intel GPU, use either llama.cpp or kobold.cpp with the Vulkan backend.

    • Avoid ollama like it’s the plague.

    • Learn chat templating and play with it in mikupad before you use an “easy” frontend, so you understand what it’s doing internally (and know when/how it goes wrong): https://github.com/lmg-anon/mikupad
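    To illustrate the chat-templating point: a template just turns a list of messages into the single prompt string the model was trained on. Here is a minimal sketch using Gemma-style turn markers (assumed from earlier Gemma releases; check the model card or tokenizer_config.json for the real Gemma 4 template before relying on it):

```python
# Sketch of chat templating with Gemma-style turn markup.
# The exact markers are an assumption based on prior Gemma models;
# the authoritative template ships with the tokenizer.
def apply_gemma_style_template(messages):
    prompt = "<bos>"
    for msg in messages:
        # Gemma-style templates call the assistant role "model".
        role = "model" if msg["role"] == "assistant" else msg["role"]
        prompt += f"<start_of_turn>{role}\n{msg['content']}<end_of_turn>\n"
    prompt += "<start_of_turn>model\n"  # cue the model to respond
    return prompt

messages = [{"role": "user", "content": "Hi!"}]
print(apply_gemma_style_template(messages))
```

    When a frontend gets this string subtly wrong (wrong markers, missing generation cue), output quality quietly tanks, which is why it pays to see the raw prompt in something like mikupad.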


    But TBH I’d point most people to Qwen 3.5/3.6 or Step 3.5 instead. They seem big, but being sparse MoEs, they can run quite quickly on single-GPU desktops: https://huggingface.co/models?other=ik_llama.cpp&sort=modified
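    A quick back-of-the-envelope on why sparse MoEs run fast despite their headline size: per-token compute tracks the *active* parameters, not the total. The numbers below are hypothetical, not the real Qwen or Step configurations.

```python
# Illustrative MoE arithmetic (hypothetical sizes, not any real model).
def active_fraction(total_experts, experts_per_token,
                    expert_params, shared_params):
    """Fraction of parameters that actually run for each token."""
    total = shared_params + total_experts * expert_params
    active = shared_params + experts_per_token * expert_params
    return active / total

# A made-up 64-expert model that routes each token through 4 experts:
frac = active_fraction(64, 4, expert_params=1.0e9, shared_params=5.0e9)
print(f"active fraction per token: {frac:.1%}")
```

    Only a small slice of the weights participates in each token, so generation speed is closer to that of a much smaller dense model, as long as the full weights fit in (or stream from) memory.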

    • TrippinMallard@lemmy.ml · English · 4 points · 2 days ago

      What’s wrong with ollama?

      • brucethemoose@lemmy.world · English · 9 points · edited 2 days ago

        Ughhh, I could go on forever, but to keep it short:

        • Tech bro enshittification: https://old.reddit.com/r/LocalLLaMA/comments/1p0u8hd/ollamas_enshitification_has_begun_opensource_is/

        • Hiding attribution to the actual open source project it’s based on: https://old.reddit.com/r/LocalLLaMA/comments/1jgh0kd/opinion_ollama_is_overhyped_and_its_unethical/

        • A huge support drain on llama.cpp, without a single cent or a notable contribution given back.

        • Constant bugs and broken models from “quick and dirty” model support updates, just for hype.

        • Breaking standard GGUFs.

        • Deliberately misnaming models (like the Deepseek Qwen distills and “Deepseek”) for hype.

        • Horrible defaults (like ancient default models, 4096 context, really bad/lazy quantizations).

        • A bunch of spam, drama, and abuse on Linkedin, Twitter, Reddit and such.

        Basically, the devs are Tech Bros. They’re scammer-adjacent. I’ve been in local inference for years, and wouldn’t touch ollama if you paid me to. I’d trust Gemini API over them any day.

        I’d recommend base llama.cpp, ik_llama.cpp, or kobold.cpp, but if you must use a “turnkey” and popular UI, LMStudio is way better.

        But the problem is, if you want a performant local LLM, nothing about local inference is really turnkey. It’s just too hardware sensitive, and moves too fast.
