Installing packages permanently in Google Colab (reusing a package after installing it once)


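When using Colab, you sometimes need a package that is not installed by default (e.g. mecab). In that case, mount Google Drive, set up a symlink to a folder on the Drive, and install the package there. Afterwards you can import it right away without installing it again.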

import os, sys
from google.colab import drive
drive.mount('/content/drive')

First, mount Google Drive as shown above, then:

my_path = '/content/notebooks'
os.symlink('/content/drive/MyDrive/Colab Notebooks/my_env', my_path)
sys.path.insert(0, my_path)

Create a my_env folder under Colab Notebooks in Google Drive and set up a symlink to it. Note that MyDrive must be written without a space. The blog I referenced used My Drive, which caused an error on my first attempt.
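
Running os.symlink a second time in the same session raises FileExistsError because the link is already there. A minimal sketch of a rerunnable version follows. The helper name link_package_dir is hypothetical and not part of Colab or the referenced blog. It assumes the Drive is already mounted.

```python
import os
import sys


def link_package_dir(target_dir, link_path):
    """Symlink link_path -> target_dir and put link_path at the front of sys.path.

    Hypothetical helper: safe to call more than once in the same session.
    """
    # Create the package folder on first use.
    os.makedirs(target_dir, exist_ok=True)

    if os.path.islink(link_path):
        # A link from an earlier run: replace it only if it points elsewhere.
        if os.path.realpath(link_path) != os.path.realpath(target_dir):
            os.unlink(link_path)
            os.symlink(target_dir, link_path)
    elif os.path.exists(link_path):
        # Refuse to overwrite a real file or directory.
        raise FileExistsError(f"{link_path} exists and is not a symlink")
    else:
        os.symlink(target_dir, link_path)

    # Avoid stacking duplicate entries when the cell is rerun.
    if link_path not in sys.path:
        sys.path.insert(0, link_path)
    return link_path
```

In Colab this would be called as link_package_dir('/content/drive/MyDrive/Colab Notebooks/my_env', '/content/notebooks') after drive.mount.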

# Test code
!pip install --target=$my_path jdc

From now on, when you start a fresh Colab session, mounting Google Drive and setting up the symlink lets you import the packages installed in my_env without installing them again.

import os, sys
from google.colab import drive
drive.mount('/content/drive')

my_path = '/content/notebooks'
os.symlink('/content/drive/MyDrive/Colab Notebooks/my_env', my_path)
sys.path.insert(0, my_path)

import jdc
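
To confirm that an import is really served from the Drive-backed folder and not from a copy preinstalled in Colab, you can check where Python would load the module from. This is a minimal sketch. The function name resolves_from is hypothetical.

```python
import importlib.util
import os


def resolves_from(module_name, directory):
    """Return True if module_name would be loaded from inside directory."""
    spec = importlib.util.find_spec(module_name)
    # Built-in modules and namespace packages have no file location.
    if spec is None or not spec.has_location:
        return False
    # realpath follows the symlink, so the link and the Drive folder compare equal.
    origin = os.path.realpath(spec.origin)
    root = os.path.realpath(directory)
    return os.path.commonpath([origin, root]) == root
```

For the example above, resolves_from('jdc', my_path) should be True when jdc is found in my_env.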

[Update] 2021.10.24

Testing with the statsmodels package shows a dependency problem.

!pip install --target=$my_path statsmodels
# ModuleNotFoundError: No module named 'statsmodels.regression.rolling'
Collecting statsmodels
  Downloading statsmodels-0.13.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (9.8 MB)
     |████████████████████████████████| 9.8 MB 5.2 MB/s 
Collecting patsy>=0.5.2
  Downloading patsy-0.5.2-py2.py3-none-any.whl (233 kB)
     |████████████████████████████████| 233 kB 54.3 MB/s 
Collecting pandas>=0.25
  Downloading pandas-1.3.4-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (11.3 MB)
     |████████████████████████████████| 11.3 MB 29.2 MB/s 
Collecting numpy>=1.17
  Downloading numpy-1.21.3-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (15.7 MB)
     |████████████████████████████████| 15.7 MB 57 kB/s 
Collecting scipy>=1.3
  Downloading scipy-1.7.1-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.whl (28.5 MB)
     |████████████████████████████████| 28.5 MB 49 kB/s 
Collecting pytz>=2017.3
  Downloading pytz-2021.3-py2.py3-none-any.whl (503 kB)
     |████████████████████████████████| 503 kB 47.5 MB/s 
Collecting python-dateutil>=2.7.3
  Downloading python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB)
     |████████████████████████████████| 247 kB 54.8 MB/s 
Collecting six
  Downloading six-1.16.0-py2.py3-none-any.whl (11 kB)
Installing collected packages: six, pytz, python-dateutil, numpy, scipy, patsy, pandas, statsmodels
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
tensorflow 2.6.0 requires numpy~=1.19.2, but you have numpy 1.21.3 which is incompatible.
tensorflow 2.6.0 requires six~=1.15.0, but you have six 1.16.0 which is incompatible.
google-colab 1.0.0 requires pandas~=1.1.0; python_version >= "3.0", but you have pandas 1.3.4 which is incompatible.
google-colab 1.0.0 requires six~=1.15.0, but you have six 1.16.0 which is incompatible.
datascience 0.10.6 requires folium==0.2.1, but you have folium 0.8.3 which is incompatible.
albumentations 0.1.12 requires imgaug<0.2.7,>=0.2.5, but you have imgaug 0.2.9 which is incompatible.
Successfully installed numpy-1.21.3 pandas-1.3.4 patsy-0.5.2 python-dateutil-2.8.2 pytz-2021.3 scipy-1.7.1 six-1.16.0 statsmodels-0.13.0
WARNING: Target directory /content/notebooks/numpy already exists. Specify --upgrade to force replacement.
WARNING: Target directory /content/notebooks/scipy.libs already exists. Specify --upgrade to force replacement.
WARNING: Target directory /content/notebooks/six-1.16.0.dist-info already exists. Specify --upgrade to force replacement.
WARNING: Target directory /content/notebooks/__pycache__ already exists. Specify --upgrade to force replacement.
WARNING: Target directory /content/notebooks/numpy.libs already exists. Specify --upgrade to force replacement.
WARNING: Target directory /content/notebooks/statsmodels-0.13.0.dist-info already exists. Specify --upgrade to force replacement.
WARNING: Target directory /content/notebooks/bin already exists. Specify --upgrade to force replacement.
WARNING: The following packages were previously imported in this runtime:
  [dateutil]
You must restart the runtime in order to use newly installed versions.

The fix needs further investigation.
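
As a starting point for that investigation, it may help to list which packages in the target folder have a different version from the copy already installed elsewhere, since the folder sits first on sys.path and its copies take precedence. The sketch below is diagnostic only and does not fix anything. The function names are hypothetical, and it assumes Python 3.8 or later for importlib.metadata (the log above is from Python 3.7, where the importlib_metadata backport would be needed instead).

```python
import re
from importlib import metadata


def _versions(paths):
    """Return {normalized package name: version} for distributions under paths."""
    found = {}
    for dist in metadata.distributions(path=list(paths)):
        name = dist.metadata["Name"]
        if name:
            # Treat '-', '_' and '.' as equivalent, as pip does.
            key = re.sub(r"[-_.]+", "-", name).lower()
            found.setdefault(key, dist.version)
    return found


def version_mismatches(target_dir, base_dirs):
    """Map name -> (version in target_dir, version in base_dirs) where they differ."""
    target = _versions([target_dir])
    base = _versions(base_dirs)
    return {
        name: (version, base[name])
        for name, version in sorted(target.items())
        if name in base and base[name] != version
    }
```

Here base_dirs would be the directories that hold the preinstalled packages, for example the entries of sys.path other than my_path.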


References:

https://teddylee777.github.io/colab/colab%EC%97%90%EC%84%9C-python%ED%8C%A8%ED%82%A4%EC%A7%80%EB%A5%BC-permanently-%EC%9D%B8%EC%8A%A4%ED%86%A8%ED%95%98%EB%8A%94-%EB%B0%A9%EB%B2%95