langchain-ChatGLM

One-click installation of langchain-ChatGLM

Open-source large models have been popping up one after another recently and are very popular. Many people want to try them, but building a large model locally requires a powerful machine even just for inference, to say nothing of fine-tuning or full-parameter training, which is out of reach without hundreds of thousands spent on GPUs. ChatGLM inference can run on a CPU, but that is not recommended because it is very slow. For the 6B model a 12 GB GPU is recommended, ideally 16 GB or more (the int4 version apparently needs only about 6 GB). On the market a 16 GB card runs about 10,000 RMB and a 24 GB card about 15,000 RMB. Unless you work on models professionally, buying one is not recommended, though it is great for playing AAA games, haha. For algorithm beginners, renting a GPU is the way to go if you just want the experience. There are plenty of compute platforms on the market; compare prices and choose what fits your needs. I use AutoDL here.

I. Registration

First go to the AutoDL official website (AutoDL, a GPU rental platform) to register an account and complete real-name verification; otherwise you will not be able to access your service URL from a browser.

AutoDL is relatively cheap, with pay-as-you-go billing. A 28 GB GPU costs about 1.18 RMB per hour, and verified students apparently get some free hours; I hear many students use it for their graduation projects.

II. Choosing a server

The page looks much like other cloud services. I hear the GPUs have to be grabbed (office workers, please don't grab them; it would be bad to hold up poor students' graduations). Choose the model you need; if the number in parentheses is 0, there are none left.

There is a very convenient feature here: you can pick a community image and use it directly, with no complex configuration. You can also use a base image, which comes with some tools preinstalled, or package the services you have installed into an image of your own.

After selecting an image, just click the “Create Now” button.

III. Starting and logging in to the service

Once the instance starts, you can log in to the server. Pick whichever connection method suits your habits; the simplest is the built-in JupyterLab web interface, which makes uploading and downloading files very convenient. I use SecureCRT and just copy the SSH login command.

Note that SecureCRT must be version 8.0 or above; older versions are not supported.
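The copied command has the standard SSH form. For reference it looks roughly like the line below (the host and port here are placeholders; use the exact command and password shown in your AutoDL console):

$ssh -p 12345 root@region-1.autodl.com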

Then execute three commands in the terminal (if you don’t know what they mean, look them up; they are very simple):

$cd /root/langchain-ChatGLM/

$conda activate /root/pyenv

$python webui.py

After the last command runs, a startup message will appear, which means the launch succeeded!

Finally, click “Customize Service”; the browser opens automatically and you can experience your freshly installed large model there.

IV. Usage

  1. You can chat with the LLM directly by choosing the dialog mode on the right; this uses the pretrained model as-is, and the experience is not bad. Feel free to tease it a bit while you’re at it, haha.
  2. You can also build your own knowledge-base Q&A by uploading your own md, pdf, docx, or txt files. Note that files should preferably be UTF-8 encoded with English file names, otherwise they may not be processed (see the conversion sketch after the next paragraph).

I uploaded txt and pdf documents to try it out, and the results were mediocre. Perhaps the documents I supplied were of poor quality, but this feature still has room for improvement.
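If a document uses a local encoding such as GBK, a minimal conversion sketch before uploading (iconv ships with most Linux images; the source encoding and file names here are assumptions):

# Convert a GBK-encoded file to UTF-8 and give it an English name
$iconv -f GBK -t UTF-8 原文档.txt > my_doc.txt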

  3. On the page you can also choose which model to load; there are five. Be sure to leave enough disk space, because switching models does not delete the previous model’s files.

During inference, especially knowledge-base Q&A, GPU memory utilization sometimes hits 99% and occasionally the memory suddenly overflows, so it is recommended to preprocess documents manually.
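One such preprocessing step, sketched below, is splitting a large text file into smaller chunks before uploading (the file names and the 500-line chunk size are arbitrary choices):

# GNU split: 500 lines per chunk, numeric suffixes, keep the .txt extension
$split -l 500 -d --additional-suffix=.txt big_doc.txt part_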

Summary:

This article covers only a simple installation; many areas still need research and optimization, and those interested can check the official repository. The maintainers update the code daily, fixing bugs and improving it constantly.

I had always thought training was the expensive part; it turns out inference isn’t cheap either. I haven’t run a stress test, but I suspect a single machine can’t support more than a few concurrent users.

Some other questions:

  1. How to update the ChatGLM model files

Delete the chatglm-6b folder under /root/model/, re-clone the model under /root/autodl-tmp/, and once the clone succeeds copy it back to /root/model/.
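A minimal sketch of those steps (the Hugging Face URL is an assumption; clone from whichever mirror you normally use, and note that the weights require git-lfs):

$rm -rf /root/model/chatglm-6b
$cd /root/autodl-tmp/
$git lfs install
$git clone https://huggingface.co/THUDM/chatglm-6b
$cp -r chatglm-6b /root/model/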

  2. How to update the repository code (since the image is used as-is, its code is older and has some bugs)

$cd /root/langchain-ChatGLM/

$rm -rf ~/.gitconfig

$git pull

If git pull reports conflicts, remove the conflicting files and run git pull again.
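One way to clear such conflicts, sketched below, is to discard the local changes blocking the pull (this throws away local edits, so back up anything you modified first):

# Discard local modifications to tracked files, then pull again
$git checkout -- .
$git pull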

Update the virtual environment dependencies; note that you must switch into the virtual environment first:

$conda activate /root/pyenv

$pip install -r requirements.txt

Modify the configuration files

Modify the webui.py file so that it serves on the port AutoDL’s custom service exposes.
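A sketch of what that change can look like, assuming the UI is started through Gradio’s launch call (the demo variable name is whatever your webui.py uses; AutoDL’s custom service typically forwards port 6006, so confirm the port in your console):

# near the bottom of webui.py, in the Gradio launch call
demo.launch(server_name="0.0.0.0",  # listen on all interfaces so AutoDL can proxy the service
            server_port=6006,       # the port exposed by AutoDL's custom service
            share=False)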

Modify two paths in the langchain-ChatGLM/configs/model_config.py file to point at the model files already on the server; if you don’t, starting webui.py will re-download the model files and blow up the disk.
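A sketch of the idea (the dictionary and key names below follow a common version of the repo’s config and may differ in yours; the point is replacing remote model IDs with local paths):

# configs/model_config.py: point both entries at models already on disk
embedding_model_dict = {
    "text2vec": "/root/model/text2vec-large-chinese",  # local path instead of a Hugging Face repo id
}

llm_model_dict = {
    "chatglm-6b": "/root/model/chatglm-6b",  # local path so webui.py does not re-download the weights
}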

Restart webui.py and it will work.
