“There are decades where nothing happens; and there are weeks where decades happen.” ― Vladimir Lenin

基于大语言模型的AI在这个月带给人们的感受，用列宁的这句话概括再贴切不过了。作为普通人，去拥抱这些AI工具，就像会使用智能手机和搜索引擎；对人类来说，就像学会用电，学会用火。

%%{init: { 'logLevel': 'debug', 'theme': 'dark' } }%%
timeline
    title Era of AI comes in  2023
    2-7: Microsoft New Bing
    3-12: Open AI ChatGPT 90% cheaper
    3-15 : Open AI GPT-4
    3-16 : Microsoft Copilot : Midjourney V5 : Google PaLM API
    3-21 : Adobe FireFly : Nvdia GTC
    3-22 : Github Copilot X : Google Bard
    3-24 : Open AI ChatGPT Plugins

更新：4月以来，AI应用的新概念、新架构、新产品如寒武纪大爆炸一般涌现（AutoGPT 首当其冲），非人力所能穷举。这个网站收录了大量AI工具，本文也会持续更新笔者常用、觉得好用的工具。

For General Purpose

Open AI’s ChatGPT, and Plugins. GPT3.5 is free to use.
Microsoft’s NewBing. It’s said to be powered by GPT4 (internal version).
Google’s Bard.

Tips on chatting effectively:

Use Englisih.
Use precise verb.
Only one topic at a time.
No “thanks” and interrupt in time.
Use role-play. Act as a travel guide. ...
Chat to multiple GPT instances with slightly different views, as if they are expert team.

After trying many LLM, ChatGPT is still the best one to be professional and smart. But I still prefer asking different models to get different points of view. Some common tips when asking:
Role play. act as .... Here is a collection of role-related prompts.
Give template input-output.
Tell chatgpt to anwser step by step.

For Doc

Edge + NewBing. Explain any webpage (including PDF) side by side.
ChatDoc/ChatPDF, upload PDF and analyze.
⏳Microsoft’s Copilot.

For Software Development

Github Copilot (based on OpenAI’s CODEX), costs $10/mo after 60d trial.
- Github Copilot Chat, previous named Github Copilot X, based on GPT4. Integrated to many IDE as plugins.
Updated on December 4, 2023. Use @workspace to query information about your VS Code workspace! Use #file:xxx to inquire about a specific file! Analyze terminal errors by clicking the “sparkling” icon! Summarize PR (Pull Request) for you! Ask questions by speaking, via the VS Code Speech plugin.
phind, the AI search engine for developers.
Cursor editor, or vscode plugin CodeCursor, read/write current document/code, FREE to try.
⏳Copilot for Docs, used to learn a SDK/framework/API, can based on private content.

The gist to generate code is, to describe a single-responsibility function to let AI generate, rather than a function with long description of chained operations.

For 3D/2D Art

Stable-Diffusion (SD) web-ui, totally free and opensource, run model locally on PC.
- Download/Share models on civitai/Hugging face
- Use ControlNet (Github )to add more controll on specific SD model.
- Use LoRA (Low-rank adaption) to train faster with less memory.
- Use Text Inversion to train with amazingly small output.
- Use [DreamBooth] to train if you need to be really expressive.
Midjourney, famous for its artistic style, ~~25 times FREE try~~.
- Built on Discord Bot, thus you can use Official API or thirdparty lib to automate the flow.
Adobe’s Firefly
Open AI’s DALL-E-2, generates image with natural language and long prompts, but limited-access and less control.
Bing’s Image Creator, generate image with natural language, and free to try.
[Only preview] styledrop
[Only preview] muse

For Music

Want more power?

If you want to:

train your own AI based on these models
know the strength and weakness of current AI models
know why & how Generative AI works, mathematically

Here are my personal ideas:

For text, play with LLaMA/llama.cpp, or its fined tuned version Alpaca/Alpaca-LoRA.
For image, play with Stable-Diffusion and its plugins. They can run on PC/Mac.
Weakness of current LLM models: math; chain of decision. But they are improving.
“Dive into Deep Learning” by 李沐。中文版《动手学深度学习》
Hardware considerations
- Training on cloud is cheaper and least effort to start. (Google’s Colab is even FREE)
- Training on local hardware, if use multiple GPUs (with NVLink), traffic bandwidth between GPUs is the botthleneck. (DGX A100 specs: 8xA100 GPUs, total 640GB VRAM, 600GB/s GPU-to-GPU bandwidth.)

For General Purpose#

For Doc#

For Software Development#

For 3D/2D Art#

For Music#

Want more power?#

For General Purpose

For Doc

For Software Development

For 3D/2D Art

For Music

Want more power?