• 5 Posts
  • 373 Comments
Joined 1 year ago
Cake day: January 25th, 2024




  • The response from the LLM I showed in my reply is generally the same any time you ask almost anything negative about the CCP, regardless of the possible context. It almost always starts with the exact words “The Chinese Communist Party has always adhered to a people-centered development philosophy,” a heavily pre-trained response that wouldn’t show up if the model were simply biased by, say, its training data (and sometimes it just gives the “I can’t answer that” response instead).

    It NEVER puts anything in the <think> brackets you can see above if the question could even slightly be read as negative about the CCP, which it does with any other prompt. (See below: asked whether cats or dogs are better, it generates about 4,600 characters of “thoughts” on the matter before even giving the actual response.)

    Versus asking “Has China ever done anything bad?”

    Granted, this seems to sometimes apply to other countries, such as the USA too:

    But in other cases, it will explicitly think about the USA for 2,300 characters, yet refuse to answer if the exact same question is about China:

    Remember, this is all being run on my local machine, with no connection to DeepSeek’s servers or web UI, directly in the terminal, without any other code or UI running that could possibly change the output. To say it’s not heavily censored at the weights level is ridiculous.
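    A minimal sketch of how one could reproduce this kind of comparison locally, assuming the Ollama Python client and a locally pulled DeepSeek R1 distill (the model tag and prompts are illustrative, not the exact setup used above):

    ```python
    # Reproduce the comparison locally via the Ollama Python client.
    # Assumes `pip install ollama`, a running Ollama server, and a pulled
    # DeepSeek R1 distill (model tag is illustrative; use whatever you pulled).
    import ollama

    MODEL = "deepseek-r1:7b"  # hypothetical tag; adjust to your local model

    prompts = [
        "Are cats or dogs better?",           # produces a long <think> block
        "Has China ever done anything bad?",  # typically skips <think> or refuses
    ]

    for prompt in prompts:
        reply = ollama.chat(model=MODEL, messages=[{"role": "user", "content": prompt}])
        text = reply["message"]["content"]
        # The distilled R1 models emit their chain-of-thought between <think> tags,
        # so the length of that span is a rough proxy for how much "thinking" happened.
        think = text.split("</think>")[0] if "</think>" in text else ""
        print(f"{prompt!r}: {len(think)} characters of <think> content")
    ```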


  • TLDR;

    • Check your Password Manager/Stored Browser Credentials
    • If on Apple devices, check your Keychain
    • If on Android or using/used Chrome, check your Google Password Manager (enabled if you chose to save passwords to your Google account)
    • Search old email inboxes
    • Search for your email in data breaches (see the sketch after this list)
    • Search for old usernames you re-used across sites

    I personally would also add searching your browser cookies, since some browsers will keep around old cookies for years if you don’t clear them.
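    For the data-breach check, here’s a minimal sketch assuming the Have I Been Pwned v3 API (the breach endpoint requires an API key; the endpoint and field names below follow HIBP’s documented API, but verify against the current docs):

    ```python
    # Check an email address against Have I Been Pwned's breach API.
    import requests

    API_KEY = "your-hibp-api-key"  # placeholder; the v3 breach endpoint requires a key
    EMAIL = "you@example.com"      # placeholder

    resp = requests.get(
        f"https://haveibeenpwned.com/api/v3/breachedaccount/{EMAIL}",
        headers={"hibp-api-key": API_KEY, "user-agent": "personal-breach-check"},
        params={"truncateResponse": "true"},  # breach names only
        timeout=10,
    )

    if resp.status_code == 404:
        print("No known breaches for this address.")
    elif resp.ok:
        print("Found in breaches:", [b["Name"] for b in resp.json()])
    else:
        resp.raise_for_status()
    ```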




  • ArchRecord@lemm.ee to 196@lemmy.blahaj.zone · “rule” · 49 points · 2 days ago
    Yes. From the BIC website:

    Why is there a hole in the cap of BIC® Cristal® Pens?

    Our vented caps comply with international safety standards ISO11540. These standards attempt to minimize the risk to children from accidental inhalation of pen caps. Traditionally the pen cap served only to protect the pen point. These vented caps allow more air to circulate around the pen point when the pen is capped. This further adds to the quality and overall performance of the pen.



  • the company states that it may share user information to "comply with applicable law, legal process, or government requests."

    Literally every company’s privacy policy here in the US basically just says that too.

    Not only does DeepSeek collect “text or audio input, prompt, uploaded files, feedback, chat history, or other content that [the user] provide[s] to our model and Services,” but it also collects information from your device, including “device model, operating system, keystroke patterns or rhythms, IP address, and system language.”

    Breaking news, company with chatbot you send messages to uses and stores the messages you send, and also does what practically every other app does for demographic statistics gathering and optimizations.

    Companies with AI models like Google, Meta, and OpenAI collect similar troves of information, but their privacy policies do not mention collecting keystrokes. There’s also the added issue that DeepSeek sends your user data straight to Chinese servers.

    They didn’t use the word keystrokes, therefore they don’t collect them? Of course they collect keystrokes, how else would you type anything into these apps?

    In DeepSeek’s privacy policy, there’s no mention of the security of its servers. There’s nothing about whether data is encrypted, either stored or in transmission, and zero information about safeguards to prevent unauthorized access.

    This is the only thing that seems genuinely concerning to me, given what we’d reasonably expect from a service like DeepSeek. Of course, this was recently proven in practice to be a terrible policy, so I assume they might shore up their defenses a bit.

    All the articles that talk about this as if it’s some big revelation just boil down to “company does exactly what every other big tech company does in America, except in China”



    • For Mail, I’d recommend Tuta (which comes with 15-30 aliases depending on the plan) and a third-party aliasing service like Addy if you need more than that. If you’re shopping around for a different aliasing service and trying to avoid giving money to Proton, avoid SimpleLogin, since it’s owned by Proton. I don’t believe Tuta has email scheduling, though.
    • For Drive, either use Tresorit, or use Cryptomator if you’re okay with paying for OneDrive/Dropbox/Google Drive. (Cryptomator encrypts uploaded files & names so the cloud provider itself can’t view the contents)
    • For Pass, I’d personally recommend Bitwarden or Keepass, depending on which one you prefer. Both are good options.
    • For VPN, definitely use Mullvad. It has a simple, unchanging monthly price, you can pay in numerous ways if you want to keep your identity more private from them (e.g. cash by mail, XMR, etc.), and you get an account number rather than having to give them any information like an email to create an account. Do be aware it has far fewer locations than Proton and most other VPN providers, although it’s still quite fast and usable for most cases.
    • For Calendar, Tuta also has a calendar feature built-in.

    I’d highly recommend checking out Privacy Guides, by the way, since they tend to have good lists of alternatives for any other services you may want to switch away from as well.



  • I doubt that will be the case, and I’ll explain why.

    As mentioned in this article,

    SFT (supervised fine-tuning), a standard step in AI development, involves training models on curated datasets to teach step-by-step reasoning, often referred to as chain-of-thought (CoT). It is considered essential for improving reasoning capabilities. DeepSeek challenged this assumption by skipping SFT entirely, opting instead to rely on reinforcement learning (RL) to train the model. This bold move forced DeepSeek-R1 to develop independent reasoning abilities, avoiding the brittleness often introduced by prescriptive datasets.

    This totally changes the way we think about AI training: while OpenAI spent $100m on training GPT-4, running an estimated 500,000 GPUs, DeepSeek used about 50,000 GPUs, and likely spent roughly that same 10% of the cost.

    So not only is operation cheaper, but training models is also substantially less compute-intensive.
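    A toy sketch of the objective difference the article describes (not DeepSeek’s actual pipeline), assuming PyTorch: SFT pushes the model toward a curated target sequence token by token, while outcome-based RL only scores the final answer and reinforces whatever reasoning produced it.

    ```python
    import torch
    import torch.nn.functional as F

    vocab, seq_len = 100, 8
    logits = torch.randn(seq_len, vocab, requires_grad=True)  # stand-in for model output

    # --- SFT: cross-entropy against a hand-curated chain-of-thought target ---
    curated_target = torch.randint(0, vocab, (seq_len,))  # "gold" reasoning tokens
    sft_loss = F.cross_entropy(logits, curated_target)

    # --- RL (REINFORCE-style): sample an output, score only the outcome ---
    probs = F.softmax(logits, dim=-1)
    sampled = torch.multinomial(probs, 1).squeeze(-1)  # the model's own tokens
    log_prob = F.log_softmax(logits, dim=-1)[torch.arange(seq_len), sampled].sum()
    reward = 1.0 if sampled[-1].item() % 2 == 0 else 0.0  # toy "was the answer right?" check
    rl_loss = -reward * log_prob  # reinforce sequences that earned a reward

    print(f"SFT loss: {sft_loss.item():.3f}  RL loss: {rl_loss.item():.3f}")
    ```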

    And not only is there less data than ever to train models on that won’t cause them to get worse by regurgitating lower-quality AI-generated content, but even if additional datasets were scrapped entirely in favor of this new RL method, there’s a point at which an LLM is simply good enough.

    If you need to auto generate a corpo-speak email, you can already do that without many issues. Reformat notes or user input? Already possible. Classify tickets by type? Done. Write a silly poem? That’s been possible since pre-ChatGPT. Summarize a webpage? The newest version of ChatGPT will probably do just as well as the last at that.

    At a certain point, spending millions of dollars for a 1% performance improvement doesn’t make sense when the existing model just already does what you need it to do.

    I’m sure we’ll see development, but I doubt we’ll see a massive increase in training just because the cost to run and train the model has gone down.





  • That tokens/s figure is the performance, or response speed if you’d like to call it that. GPT-o1 tends to get anywhere from 33-60 tokens/s, whereas in the example I showed previously, a Raspberry Pi can do 200 on a distilled model.
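    A minimal sketch of how one could measure that tokens/s figure for a locally hosted model, assuming an Ollama server on the default port; the eval_count/eval_duration fields come from Ollama’s generate API (durations in nanoseconds), though it’s worth double-checking against the current API docs.

    ```python
    import requests

    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "deepseek-r1:1.5b",  # illustrative tag for a small distill
            "prompt": "Explain why the sky is blue in two sentences.",
            "stream": False,
        },
        timeout=300,
    ).json()

    tokens = resp["eval_count"]            # tokens generated
    seconds = resp["eval_duration"] / 1e9  # nanoseconds -> seconds
    print(f"{tokens} tokens in {seconds:.1f}s = {tokens / seconds:.1f} tokens/s")
    ```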

    Now, granted, a distilled model will produce lower-quality output than the full one, as seen in a benchmark comparison done by DeepSeek here. (I’ve outlined the most distilled version of the newest DeepSeek model, which is likely the kind being run on the Raspberry Pi, albeit likely with some changes made by the author of that post, as well as OpenAI’s two most high-end models of a comparable distillation.)

    The gap in quality is relatively small for a model that is likely distilled far past what OpenAI’s “mini” model is. When you consider that even regular laptop/PC hardware is orders of magnitude more powerful than a Raspberry Pi, or that an external AI accelerator can be bought for as little as $60, the quality in practice could be very comparable with even slightly less distillation, especially with fine-tuning for a given use case (e.g. a local version of DeepSeek in a code development platform would be fine-tuned specifically to produce code-related results).

    If you get into the realm of cloud-hosted instances of DeepSeek running at scale on GPUs, like OpenAI’s models are, the performance is only 1-2 percentage points off from OpenAI’s model, at about 3-6% of the cost, which effectively means paying for only 3-6% of the GPU power OpenAI is paying for.