GPU memory usage keeps increasing after each round while fine-tuning an LLM with an adapter, and the increment is approximately the same every round. I suspect this is because new clients join in each round, each bringing new model parameters. I've already set share_local_model and llm.adapter.mv_to_cpu to True, which should move the adapter to the CPU after each round, so why does GPU memory still increase? I'd appreciate it if anyone could help me with this issue. Thanks in advance!
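For reference, here is a minimal sketch of how the per-round growth described above could be measured. It is not part of the framework: `read_gpu_bytes` and `RoundMemoryLog` are hypothetical helper names, and the sketch assumes `record()` is called once at the end of every round. It only tracks memory allocated by PyTorch tensors (`torch.cuda.memory_allocated`), so cached or non-PyTorch memory shown by nvidia-smi would not appear here.

```python
from typing import Callable, List


def read_gpu_bytes() -> int:
    """Bytes currently allocated by PyTorch tensors on the current GPU.

    Returns 0 if PyTorch is not installed or no CUDA device is available.
    """
    try:
        import torch
    except ImportError:
        return 0
    if not torch.cuda.is_available():
        return 0
    return torch.cuda.memory_allocated()


class RoundMemoryLog:
    """Hypothetical helper: records one memory reading per round."""

    def __init__(self, reader: Callable[[], int] = read_gpu_bytes) -> None:
        self.reader = reader
        self.readings: List[int] = []

    def record(self) -> int:
        """Take a reading; assumed to be called at the end of each round."""
        value = self.reader()
        self.readings.append(value)
        return value

    def deltas(self) -> List[int]:
        """Difference between each pair of consecutive readings."""
        return [b - a for a, b in zip(self.readings, self.readings[1:])]

    def grows_steadily(self, tolerance: float = 0.1) -> bool:
        """True if every round added memory and the increments are within
        `tolerance` (as a fraction of the largest increment) of each other."""
        d = self.deltas()
        if not d or min(d) <= 0:
            return False
        return (max(d) - min(d)) <= tolerance * max(d)
```

If `grows_steadily()` returns True over several rounds, that would match the roughly constant per-round increase I am seeing, and it would suggest that something allocated in each round is still referenced on the GPU afterwards.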