
Testing Machine Learning Systems: Code, Data and Models

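## Intuition

In this lesson, we'll learn how to test code, data and models to build a machine learning system that we can reliably iterate on. Tests are a way to ensure that something works as intended. We're incentivized to implement tests and discover sources of error as early in the development cycle as possible, so that we can reduce [downstream costs](https://assets.deepsource.io/39ed384/images/blog/cost-of-fixing-bugs/chart.jpg) and wasted time. Once we've designed our tests, we can execute them automatically every time we change or add to our codebase.

> tip
>
> We highly recommend that you explore this lesson *after* completing the previous lessons, since the topics (and code) are developed iteratively. We did, however, create the [testing-ml](https://github.com/GokuMohandas/testing-ml) repository for a quick overview with an interactive notebook.

### Types of tests

There are five main types of tests, which are used at different points in the development cycle:

1. `Unit tests`: tests on individual components that each have a [single responsibility](https://en.wikipedia.org/wiki/Single-responsibility_principle) (ex. a function that filters a list).
2. `Integration tests`: tests on the combined functionality of individual components (ex. data processing).
3. `System tests`: tests on the design of a system for expected outputs given inputs (ex. training, inference, etc.).
4. `Acceptance tests`: tests to verify that requirements have been met, usually referred to as User Acceptance Testing (UAT).
5. `Regression tests`: tests based on errors we've seen before, to ensure that new changes don't reintroduce them.

While ML systems are probabilistic in nature, they are composed of many deterministic components that can be tested in a similar manner to traditional software systems. The distinction in testing ML systems begins when we move from testing code to testing the [data](https://franztao.github.io/2022/10/01/Testing//./#data) and [models](https://franztao.github.io/2022/10/01/Testing//./#models).

![types of tests](https://upload-images.jianshu.io/upload_images/27840083-744c45174eee9b23.png)

> There are many other types of functional and non-functional tests as well, such as smoke tests (quick health checks), performance tests (load, stress), security tests, etc., but we can generalize all of these under the system tests above.

### How should we test?

The framework to use when composing tests is the [Arrange Act Assert](http://wiki.c2.com/?ArrangeActAssert) methodology.

- `Arrange`: set up the different inputs to test on.
- `Act`: apply the inputs to the component we want to test.
- `Assert`: confirm that we received the expected output.

> `Cleaning` is an unofficial fourth step of this methodology, because it's important not to leave behind remnants of a previous test that may affect subsequent tests. We can use packages such as [pytest-randomly](https://github.com/pytest-dev/pytest-randomly) to test against state dependency by executing tests in random order.

In Python, there are many tools, such as [unittest](https://docs.python.org/3/library/unittest.html), [pytest](https://docs.pytest.org/en/stable/), etc., that allow us to easily implement our tests while adhering to the *Arrange Act Assert* framework. These tools come with powerful built-in functionality such as parametrization, filters and more, to test many conditions at scale.

### What should we test?

When *arranging* our inputs and *asserting* our expected outputs, which aspects of the inputs and outputs should we be testing?
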
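- **inputs**: data types, format, length, edge cases (min/max, small/large, etc.)
- **outputs**: data types, formats, exceptions, intermediary and final outputs

> 👉 We'll cover the specific details of what to test for our [data](https://franztao.github.io/2022/10/01/Testing//./#data) and [models](https://franztao.github.io/2022/10/01/Testing//./#models) below.

## Best practices

Regardless of the framework we use, it's important to tie testing tightly into the development process.

- `atomic`: when creating functions and classes, we need to ensure that they have a [single responsibility](https://en.wikipedia.org/wiki/Single-responsibility_principle) so that we can easily test them. If they don't, we'll need to split them into more granular components.
- `compose`: when we create new components, we want to compose tests to validate their functionality. It's a great way to ensure reliability and catch errors early on.
- `reuse`: we should maintain central repositories where core functionality is tested at the source and reused across many projects. This significantly reduces the testing effort for each new project's codebase.
- `regression`: we want to account for new errors we come across with a regression test, so that we can ensure we don't reintroduce the same errors in the future.
- `coverage`: we want to ensure [100% coverage](https://franztao.github.io/2022/10/01/Testing//#coverage) for our codebase. This doesn't mean writing a test for every single line of code, but rather accounting for every single line.
- `automate`: in case we forget to run our tests before committing to a repository, we want tests to run automatically when we make changes to our codebase. We'll learn how to do this locally using [pre-commit hooks](https://franztao.github.io/2022/10/01/Testing//../pre-commit/) and remotely via [GitHub Actions](https://franztao.github.io/2022/10/01/Testing//../cicd/#github-actions) in subsequent lessons.

## Test-driven development

[Test-driven development (TDD)](https://en.wikipedia.org/wiki/Test-driven_development) is the process of writing a test before writing the functionality, to ensure that tests are always written. This is in contrast to writing the functionality first and composing tests afterwards. Here are our thoughts on this:

- it's good to write tests as we progress, but doing so does not signify 100% correctness.
- initial time should be spent on design before ever getting into the code or tests.

Perfect coverage doesn't mean that our application is error free if the tests aren't meaningful and don't encompass the field of possible inputs, intermediates and outputs. Therefore, we should work towards better design and agility when facing errors: resolve them quickly and write test cases around them so that they don't recur.

## Application

In our [application](https://github.com/GokuMohandas/mlops-course), we'll be testing the code, data and models. We'll start by creating a separate `tests` directory with a `code` subdirectory for testing our `tagifai` scripts. We'll create subdirectories for testing [data](https://franztao.github.io/2022/10/01/Testing//#🔢nbsp-data) and [models](https://franztao.github.io/2022/10/01/Testing//#🤖nbsp-models) further below.

```
mkdir tests
cd tests
mkdir code
# create the test scripts listed in the tree below
touch code/test_data.py code/test_evaluate.py code/test_main.py code/test_predict.py code/test_utils.py
cd ../
```
```
tests/
└── code/
│   ├── test_data.py
│   ├── test_evaluate.py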
│   ├── test_main.py
│   ├── test_predict.py
│   └── test_utils.py
```

Feel free to write the tests and organize them in these scripts *after* learning about all the concepts in this lesson. We suggest using our [`tests`](https://github.com/GokuMohandas/mlops-course/tree/main/tests) directory on GitHub as a reference.

> Notice that our `tagifai/train.py` script does not have a corresponding `tests/code/test_train.py`. Some scripts have large functions (ex. `train.train()`, `train.optimize()`, `predict.predict()`, etc.) with dependencies (ex. artifacts), and it makes sense to test them via `tests/code/test_main.py`.

## 🧪 Pytest

We're going to use [pytest](https://docs.pytest.org/en/stable/) as our testing framework because of its powerful built-in features such as [parametrization](https://franztao.github.io/2022/10/01/Testing//#parametrize), [fixtures](https://franztao.github.io/2022/10/01/Testing//#fixtures), [markers](https://franztao.github.io/2022/10/01/Testing//#markers) and more.

```
pip install pytest==7.1.2
```

Since this testing package is not integral to the core machine learning operations, let's create a separate list in our `setup.py` and add it to our `extras_require`:

```
# setup.py
test_packages = [
    "pytest==7.1.2",
]

# Define our package
setup(
    ...
    extras_require={
        "dev": docs_packages + style_packages + test_packages,
        "docs": docs_packages,
        "test": test_packages,
    },
)
```

We created an explicit `test` option because a user may want to download only the testing packages. We'll see this in action when we use [CI/CD workflows](https://franztao.github.io/2022/10/01/Testing//../cicd/) to run tests via GitHub Actions.

### Configuration

By default, pytest expects tests to be organized under a `tests` directory. However, we can also add to our existing `pyproject.toml` file to configure any other test directories. Once in the directory, pytest looks for Python scripts matching `test_*.py`, but we can configure it to read any other file patterns as well.

```
# Pytest
[tool.pytest.ini_options]
testpaths = ["tests"]
python_files = "test_*.py"
```

### Assertions

Let's see what a sample test and its results look like. Assume we have a simple function that determines whether a fruit is crisp or not:

```
# food/fruits.py
def is_crisp(fruit):
    if fruit:
        fruit = fruit.lower()
    if fruit in ["apple", "watermelon", "cherries"]:
        return True
    elif fruit in ["orange", "mango", "strawberry"]:
        return False
    else:
        raise ValueError(f"{fruit} not in known list of fruits.")
    return False
```

To test this function, we can use [assert statements](https://docs.pytest.org/en/stable/assert.html) to map inputs to expected outputs. The expression following the word `assert` must evaluate to True.

```
# tests/food/test_fruits.py
import pytest

from food.fruits import is_crisp


def test_is_crisp():
    assert is_crisp(fruit="apple")
    assert is_crisp(fruit="Apple")
    assert not is_crisp(fruit="orange")
    # One call per block: a `with` block exits at the first exception raised.
    with pytest.raises(ValueError):
        is_crisp(fruit=None)
    with pytest.raises(ValueError):
        is_crisp(fruit="pear")
```

> We can also assert on [exceptions](https://docs.pytest.org/en/stable/assert.html#assertions-about-expected-exceptions), as in the `with pytest.raises(ValueError)` blocks above, where the operation under each `with` statement is expected to raise the specified exception.

> Example of using `assert` in our project
>
> ```
> # tests/code/test_evaluate.py
> def test_get_metrics():
>     y_true = np.array([0, 0, 1, 1])
>     y_pred = np.array([0, 1, 0, 1])
>     classes = ["a", "b"]
>     performance = evaluate.get_metrics(y_true=y_true, y_pred=y_pred, classes=classes, df=None)
>     assert performance["overall"]["precision"] == 2/4
>     assert performance["overall"]["recall"] == 2/4
>     assert performance["class"]["a"]["precision"] == 1/2
>     assert performance["class"]["a"]["recall"] == 1/2
>     assert performance["class"]["b"]["precision"] == 1/2
>     assert performance["class"]["b"]["recall"] == 1/2
> ```

### Execution

We can execute the tests above at several different levels of granularity:

```
python3 -m pytest                                           # all tests
python3 -m pytest tests/food                                # tests under a directory
python3 -m pytest tests/food/test_fruits.py                 # tests for a single file
python3 -m pytest tests/food/test_fruits.py::test_is_crisp  # tests for a single function
```

Running our specific test above would produce the following output:

```
python3 -m pytest tests/food/test_fruits.py::test_is_crisp
```
```
tests/food/test_fruits.py::test_is_crisp .           [100%]
```

Had any of the assertions in this test failed, we would see the failed assertion along with the expected and actual output of our function.

```
tests/food/test_fruits.py F                          [100%]

    def test_is_crisp():
>       assert is_crisp(fruit="orange")
E       AssertionError: assert False
E        +  where False = is_crisp(fruit='orange')
```

> tip
>
> It's important to test for the variety of inputs and expected outputs that we outlined [above](https://franztao.github.io/2022/10/01/Testing//#how-should-we-test) and to **never assume that a test is too trivial**. In the example above, it's important that we test both "apple" and "Apple" in case our function didn't account for casing!

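The Arrange Act Assert steps described earlier can be made explicit even in a very small test. The following is a minimal sketch, not part of the original project: it copies the `is_crisp` function inline so that it runs on its own, and the names `test_is_crisp_aaa` and `raises_value_error` are hypothetical helpers introduced only for illustration. Plain `assert` and `try`/`except` are used so the snippet runs without pytest installed.

```python
def is_crisp(fruit):
    """Copy of the toy function from food/fruits.py above."""
    if fruit:
        fruit = fruit.lower()
    if fruit in ["apple", "watermelon", "cherries"]:
        return True
    elif fruit in ["orange", "mango", "strawberry"]:
        return False
    else:
        raise ValueError(f"{fruit} not in known list of fruits.")


def test_is_crisp_aaa():
    # Arrange: inputs paired with the outputs we expect.
    cases = {"apple": True, "Apple": True, "orange": False}

    # Act: apply every input to the component under test.
    results = {fruit: is_crisp(fruit=fruit) for fruit in cases}

    # Assert: the observed outputs match the expected ones.
    assert results == cases


def raises_value_error(fruit):
    """Hypothetical helper: True if is_crisp raises ValueError for this input."""
    try:
        is_crisp(fruit=fruit)
    except ValueError:
        return True
    return False


test_is_crisp_aaa()
```

Keeping the three steps visually separate makes it easier to see which inputs were covered and to add edge cases (ex. `None` or an unknown fruit) later.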
### Classes

We can also test classes and their respective functions by creating test classes. Within a test class, we can optionally define [functions](https://docs.pytest.org/en/stable/xunit_setup.html) that are executed automatically when we set up or tear down a class instance or use a class method.

- `setup_class`: set up the state for any class instance.
- `teardown_class`: tear down the state created in `setup_class`.
- `setup_method`: called before every method to set up any state.
- `teardown_method`: called after every method to tear down any state.

```
class Fruit(object):
    def __init__(self, name):
        self.name = name

class TestFruit(object):
    @classmethod
    def setup_class(cls):
        """Set up the state for any class instance."""
        pass

    @classmethod
    def teardown_class(cls):
        """Teardown the state created in setup_class."""
        pass

    def setup_method(self):
        """Called before every method to setup any state."""
        self.fruit = Fruit(name="apple")

    def teardown_method(self):
        """Called after every method to teardown any state."""
        del self.fruit

    def test_init(self):
        assert self.fruit.name == "apple"
```

We can execute all the tests for our class by specifying the class name:

```
python3 -m pytest tests/food/test_fruits.py::TestFruit
```
```
tests/food/test_fruits.py::TestFruit .  [100%]
```

> Example of testing a `class` in our project
>
> ```
> # tests/code/test_data.py
> import tempfile
> from pathlib import Path
>
> import numpy as np
>
> from tagifai import data
>
>
> class TestLabelEncoder:
>     @classmethod
>     def setup_class(cls):
>         """Called before every class initialization."""
>         pass
>
>     @classmethod
>     def teardown_class(cls):
>         """Called after every class initialization."""
>         pass
>
>     def setup_method(self):
>         """Called before every method."""
>         self.label_encoder = data.LabelEncoder()
>
>     def teardown_method(self):
>         """Called after every method."""
>         del self.label_encoder
>
>     def test_empty_init(self):
>         label_encoder = data.LabelEncoder()
>         assert label_encoder.index_to_class == {}
>         assert len(label_encoder.classes) == 0
>
>     def test_dict_init(self):
>         class_to_index = {"apple": 0, "banana": 1}
>         label_encoder = data.LabelEncoder(class_to_index=class_to_index)
>         assert label_encoder.index_to_class == {0: "apple", 1: "banana"}
>         assert len(label_encoder.classes) == 2
>
>     def test_len(self):
>         assert len(self.label_encoder) == 0
>
>     def test_save_and_load(self):
>         with tempfile.TemporaryDirectory() as dp:
>             fp = Path(dp, "label_encoder.json")
>             self.label_encoder.save(fp=fp)
>             label_encoder = data.LabelEncoder.load(fp=fp)
>             assert len(label_encoder.classes) == 0
>
>     def test_str(self):
>         assert str(data.LabelEncoder()) == ""
>
>     def test_fit(self):
>         label_encoder = data.LabelEncoder()
>         label_encoder.fit(["apple", "apple", "banana"])
>         assert "apple" in label_encoder.class_to_index
>         assert "banana" in label_encoder.class_to_index
>         assert len(label_encoder.classes) == 2
>
>     def test_encode_decode(self):
>         class_to_index = {"apple": 0, "banana": 1}
>         y_encoded = [0, 0, 1]
>         y_decoded = ["apple", "apple", "banana"]
>         label_encoder = data.LabelEncoder(class_to_index=class_to_index)
>         label_encoder.fit(["apple", "apple", "banana"])
>         assert np.array_equal(label_encoder.encode(y_decoded), np.array(y_encoded))
>         assert label_encoder.decode(y_encoded) == y_decoded
> ```

### Parametrize

So far in our tests, we've had to create individual assert statements to validate different combinations of inputs and expected outputs. There's some redundancy here, because the inputs always feed into our functions as arguments and the outputs are always compared with our expected outputs. To remove this redundancy, pytest has the [`@pytest.mark.parametrize`](https://docs.pytest.org/en/stable/parametrize.html) decorator, which allows us to represent our inputs and outputs as parameters.

```
@pytest.mark.parametrize(
    "fruit, crisp",
    [
        ("apple", True),
        ("Apple", True),
        ("orange", False),
    ],
)
def test_is_crisp_parametrize(fruit, crisp):
    assert is_crisp(fruit=fruit) == crisp
```
```
python3 -m pytest tests/food/test_is_crisp_parametrize.py ...   [100%]
```

1. `[Line 2]`: define the names of the parameters under the decorator, ex. "fruit, crisp" (note that this is a single string).
2. `[Lines 3-7]`: provide a list of value combinations for the parameters from Step 1.
3. `[Line 9]`: pass the parameter names to the test function.
4. `[Line 10]`: include the necessary assert statements, which will be executed for each combination in the list from Step 2.

Similarly, we can pass in an exception as the expected result:

```
@pytest.mark.parametrize(
    "fruit, exception",
    [
        ("pear", ValueError),
    ],
)
def test_is_crisp_exceptions(fruit, exception):
    with pytest.raises(exception):
        is_crisp(fruit=fruit)
```

> Example of `parametrize` in our project
>
> ```
> # tests/code/test_data.py
> from tagifai import data
> @pytest.mark.parametrize(
>     "text, lower, stem, stopwords, cleaned_text",
>     [
>         ("Hello worlds", False, False, [], "Hello worlds"),
>         ("Hello worlds", True, False, [], "hello worlds"),
>         ...
>     ],
> )
> def test_preprocess(text, lower, stem, stopwords, cleaned_text):
>     assert (
>         data.clean_text(
>             text=text,
>             lower=lower,
>             stem=stem,
>             stopwords=stopwords,
>         )
>         == cleaned_text
>     )
> ```

### Fixtures

Parametrization allows us to reduce redundancy inside test functions, but what about reducing redundancy across different test functions? For example, suppose that several functions all take a dataframe as an input. Here, we can use pytest's built-in [fixture](https://docs.pytest.org/en/stable/fixture.html), which is a function that is executed before the test function.

```
@pytest.fixture
def my_fruit():
    fruit = Fruit(name="apple")
    return fruit

def test_fruit(my_fruit):
    assert my_fruit.name == "apple"
```

> Note that the name of the fixture and the input to the test function are identical (`my_fruit`).

We can also apply fixtures to classes, in which case the fixture function is invoked whenever any method in the class is called.

```
@pytest.mark.usefixtures("my_fruit")
class TestFruit:
    ...
```

> Example of `fixtures` in our project
>
> In our project, we use fixtures to efficiently pass a set of inputs (ex. a Pandas DataFrame) to the different testing functions that require them (cleaning, splitting, etc.).
>
> ```
> # tests/code/test_data.py
> @pytest.fixture(scope="module")
> def df():
>     data = [
>         {"title": "a0", "description": "b0", "tag": "c0"},
>         {"title": "a1", "description": "b1", "tag": "c1"},
>         {"title": "a2", "description": "b2", "tag": "c1"},
>         {"title": "a3", "description": "b3", "tag": "c2"},
>         {"title": "a4", "description": "b4", "tag": "c2"},
>         {"title": "a5", "description": "b5", "tag": "c2"},
>     ]
>     df = pd.DataFrame(data * 10)
>     return df
>
>
> @pytest.mark.parametrize(
>     "labels, unique_labels",
>     [
>         ([], ["other"]),  # no set of approved labels
>         (["c3"], ["other"]),  # no overlap b/w approved/actual labels
>         (["c0"], ["c0", "other"]),  # partial overlap
>         (["c0", "c1", "c2"], ["c0", "c1", "c2"]),  # complete overlap
>     ],
> )
> def test_replace_oos_labels(df, labels, unique_labels):
>     replaced_df = data.replace_oos_labels(
>         df=df.copy(), labels=labels, label_col="tag", oos_label="other"
>     )
>     assert set(replaced_df.tag.unique()) == set(unique_labels)
> ```
> Note that we don't use the `df` fixture directly inside our parametrized test function (we pass in `df.copy()`). If we did, we'd be changing `df`'s values after each parametrization.
>
> > Tip
> >
> > When creating fixtures around datasets, it's best practice to create a simplified version that still adheres to the same schema. For example, in the dataframe fixture above, we create a smaller dataframe that still has the same column names as our actual dataframe. Though we could have loaded our actual dataset, doing so can cause issues as the dataset changes over time (new labels, removed labels, very large dataset, etc.).

Fixtures can have different scopes depending on how we want to use them. For example, our `df` fixture has module scope because we don't want to recreate it after every test; instead, we want to create it just once for all the tests in our module (`tests/code/test_data.py`).

- `function`: the fixture is destroyed after every test. `[default]`
- `class`: the fixture is destroyed after the last test in the class.
- `module`: the fixture is destroyed after the last test in the module (script).
- `package`: the fixture is destroyed after the last test in the package.
- `session`: the fixture is destroyed after the last test of the session.

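To see why a fixture that is shared across tests (ex. module scope) goes hand in hand with passing a copy, here is a minimal sketch that avoids pytest and pandas altogether. Everything in it is an assumption made for illustration: `SHARED_ROWS` stands in for the module-scoped fixture, and this `replace_oos_labels` is a hypothetical in-place variant, not the project's actual function.

```python
import copy

# Stand-in for a module-scoped fixture: built once and shared by every test.
SHARED_ROWS = [{"tag": "c0"}, {"tag": "c1"}, {"tag": "c2"}]


def replace_oos_labels(rows, labels, oos_label="other"):
    """Hypothetical in-place variant: overwrite tags that are not in `labels`."""
    for row in rows:
        if row["tag"] not in labels:
            row["tag"] = oos_label
    return rows


def unique_tags(rows):
    """Collect the distinct tags present in the rows."""
    return {row["tag"] for row in rows}


# Passing a copy keeps the shared object intact between calls.
first = replace_oos_labels(copy.deepcopy(SHARED_ROWS), labels=["c0"])
second = replace_oos_labels(copy.deepcopy(SHARED_ROWS), labels=["c0", "c1", "c2"])

# Passing the same object twice leaks state from the first call into the second.
leaky = copy.deepcopy(SHARED_ROWS)
replace_oos_labels(leaky, labels=["c0"])
leaked = replace_oos_labels(leaky, labels=["c0", "c1", "c2"])

print(unique_tags(second))  # all three original tags survive
print(unique_tags(leaked))  # "c1" and "c2" were already overwritten
```

With the copies, the second call still sees all three original tags; without them, the labels overwritten by the first call are gone, so a later parametrized case would fail for reasons unrelated to the code under test.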
Function is the lowest-level scope, while [session](https://docs.pytest.org/en/6.2.x/fixture.html#scope-sharing-fixtures-across-classes-modules-packages-or-session) is the highest. The highest-level scoped fixtures are executed first.

> Typically, when we have many fixtures in a particular test file, we can organize them all in a `fixtures.py` script and invoke them as needed.

### Markers

We've been able to execute our tests at various levels of granularity (all tests, script, function, etc.), but we can create custom granularity by using [markers](https://docs.pytest.org/en/stable/mark.html). We've already used one type of marker (parametrize), but there are several other [built-in markers](https://docs.pytest.org/en/stable/mark.html#mark) as well. For example, the [`skipif`](https://docs.pytest.org/en/stable/skipping.html#id1) marker allows us to skip the execution of a test if a condition is met. Suppose we only want to test training our model when a GPU is available:

```
@pytest.mark.skipif(
    not torch.cuda.is_available(),
    reason="Full training tests require a GPU."
)
def test_training():
    pass
```

We can also create our own custom markers, with the exception of a few [reserved](https://docs.pytest.org/en/stable/reference.html#marks) marker names.

```
@pytest.mark.fruits
def test_fruit(my_fruit):
    assert my_fruit.name == "apple"
```

We can execute them by using the `-m` flag, which requires a (case-sensitive) marker expression, as shown below:

```
pytest -m "fruits"      #  runs all tests marked with `fruits`
pytest -m "not fruits"  #  runs all tests besides those marked with `fruits`
```

> tip
>
> The proper way to use markers is to explicitly list the ones we've created in our [pyproject.toml](https://github.com/GokuMohandas/mlops-course/blob/main/pyproject.toml) file. Here we can specify, with the `--strict-markers` flag, that all markers must be defined in this file, and then declare our markers (with some information about them) in the `markers` list:
>
> ```
> @pytest.mark.training
> def test_train_model():
>     assert ...
> ```
> ```
> # Pytest
> [tool.pytest.ini_options]
> testpaths = ["tests"]
> python_files = "test_*.py"
> addopts = "--strict-markers --disable-pytest-warnings"
> markers = [
>     "training: tests that involve training",
> ]
> ```
> Once we do this, we can view the list of all our existing markers by executing `pytest --markers`, and we'll receive an error when we try to use a new marker that isn't defined here.

### Coverage

As we develop tests for our application's components, it's important to know how well we're covering our codebase and whether we've missed anything. We can use the [Coverage](https://coverage.readthedocs.io/) library to track and visualize how much of our codebase our tests account for. With pytest, using this package is even easier thanks to the [pytest-cov](https://pytest-cov.readthedocs.io/) plugin.

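```
pip install pytest-cov==2.10.1
```

And we'll add this to our `setup.py` script:

```
# setup.py
test_packages = [
    "pytest==7.1.2",
    "pytest-cov==2.10.1",
]
```
```
python3 -m pytest --cov tagifai --cov app --cov-report html
```

![pytest](https://upload-images.jianshu.io/upload_images/27840083-28b8d8e511a73d8d.png)

Here we're asking for coverage of all the code in our tagifai and app directories, and for the report to be generated in HTML format. When we run this, we'll see the tests from our tests directory executing while the coverage plugin keeps track of which lines in our application are being executed. Once the tests are complete, we can view the generated report (by default at `htmlcov/index.html`) and click on individual files to see which parts were not covered by any tests. This is especially useful when we forget to test for certain conditions, exceptions, etc.

![test coverage](https://upload-images.jianshu.io/upload_images/27840083-88727a764a09e446.png)

> warning
>
> Though we have 100% coverage, this does not mean that our application is perfect. Coverage only indicates that a piece of code was executed in a test, not necessarily that every part of it was tested, let alone thoroughly tested. Therefore, coverage should **never** be used as a representation of correctness. However, it is very useful to maintain coverage at 100% so that we know when new functionality has yet to be tested. In our CI/CD lesson, we'll see how to use GitHub Actions to make 100% coverage a requirement when pushing to specific branches.

### Exclusions

Sometimes it doesn't make sense to write tests that cover every single line in our application, yet we still want to account for these lines so that we can maintain 100% coverage. We have two levels of purview when applying exclusions:

1. Excusing lines by adding the comment `# pragma: no cover, ` followed by a short reason:

```
if trial:  # pragma: no cover, optuna pruning
    trial.report(val_loss, epoch)
    if trial.should_prune():
        raise optuna.TrialPruned()
```

2. Excluding files by specifying them in our `pyproject.toml` configuration:

```
# Pytest coverage
[tool.coverage.run]
omit = ["app/gunicorn.py"]
```

> The main point is that we're able to add justification for these exclusions through comments, so that our team can follow our reasoning.

Now that we have a foundation for testing traditional software, let's dive into testing our data and models in the context of machine learning systems.

## Data

So far, we've used unit and integration tests to test the functions that interact with our data, but we haven't tested the validity of the data itself. We're going to use the [great expectations](https://github.com/great-expectations/great_expectations) library to test what our data is expected to look like. It's a library that allows us to create expectations about what our data should look like in a standardized way. It also provides modules to connect seamlessly with backend data sources such as local file systems, S3, databases, etc. Let's explore the library by implementing the expectations we'll need for our application.

> 👉 Follow along with the interactive notebook in the [**testing-ml**](https://github.com/GokuMohandas/testing-ml) repository as we implement the concepts below.
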
```
pip install great-expectations==0.15.15
```

And we'll add this to our `setup.py` script:

```
# setup.py
test_packages = [
    "pytest==7.1.2",
    "pytest-cov==2.10.1",
    "great-expectations==0.15.15",
]
```

First, we'll load the data that we'd like to apply our expectations to. We can load our data from a variety of [sources](https://docs.greatexpectations.io/docs/guides/connecting_to_your_data/connect_to_data_overview) (filesystem, database, cloud, etc.), which we can then wrap in a [Dataset module](https://legacy.docs.greatexpectations.io/en/latest/autoapi/great_expectations/dataset/index.html) (Pandas / Spark DataFrame, SQLAlchemy).

```
import great_expectations as ge
import json
import pandas as pd
from urllib.request import urlopen
```
```
# Load labeled projects
projects = pd.read_csv("https://raw.githubusercontent.com/GokuMohandas/Made-With-ML/main/datasets/projects.csv")
tags = pd.read_csv("https://raw.githubusercontent.com/GokuMohandas/Made-With-ML/main/datasets/tags.csv")
df = ge.dataset.PandasDataset(pd.merge(projects, tags, on="id"))
print(f"{len(df)} projects")
df.head(5)
```

|     | id  | created_on          | title                                             | description                                         | tag                    |
| --- | --- | ------------------- | ------------------------------------------------- | --------------------------------------------------- | ---------------------- |
| 0   | 6   | 2020-02-20 06:43:18 | Comparison between YOLO and RCNN on real world... | Bringing theory to experiment is cool. We can ...   | computer-vision        |
| 1   | 7   | 2020-02-20 06:47:21 | Show, Infer & Tell: Contextual Inference for C... | The beauty of the work lies in the way it arch...   | computer-vision        |
| 2   | 9   | 2020-02-24 16:24:45 | Awesome Graph Classification                      | A collection of important graph embedding, cla...   | graph-learning         |
| 3   | 15  | 2020-02-28 23:55:26 | Awesome Monte Carlo Tree Search                   | A curated list of Monte Carlo tree search papers... | reinforcement-learning |
| 4   | 19  | 2020-03-03 13:54:31 | Diffusion to Vector                               | Reference implementation of Diffusion2Vec (Com...   | graph-learning         |

### Expectations

When creating expectations about what our data should look like, we want to think about our entire dataset and all the features (columns) within it.

```
# Presence of specific features
df.expect_table_columns_to_match_ordered_list(
    column_list=["id", "created_on", "title", "description", "tag"]
)

# Unique combinations of features (detect data leaks!)
df.expect_compound_columns_to_be_unique(column_list=["title", "description"])

# Missing values
df.expect_column_values_to_not_be_null(column="tag")

# Unique values
df.expect_column_values_to_be_unique(column="id")

# Type adherence
df.expect_column_values_to_be_of_type(column="title", type_="str")

# List (categorical) / range (continuous) of allowed values
tags = ["computer-vision", "graph-learning", "reinforcement-learning",
        "natural-language-processing", "mlops", "time-series"]
df.expect_column_values_to_be_in_set(column="tag", value_set=tags)
```

Each of these expectations creates an output with details about success or failure, expected and observed values, the expectation raised, etc. For example, the expectation `df.expect_column_values_to_be_of_type(column="title", type_="str")` would produce the following if successful:

```
{
  "exception_info": {
    "raised_exception": false,
    "exception_traceback": null,
    "exception_message": null
  },
  "success": true,
  "meta": {},
  "expectation_config": {
    "kwargs": {
      "column": "title",
      "type_": "str",
      "result_format": "BASIC"
    },
    "meta": {},
    "expectation_type": "_expect_column_values_to_be_of_type__map"
  },
  "result": {
    "element_count": 955,
    "missing_count": 0,
    "missing_percent": 0.0,
    "unexpected_count": 0,
    "unexpected_percent": 0.0,
    "unexpected_percent_nonmissing": 0.0,
    "partial_unexpected_list": []
  }
}
```

If an expectation fails (ex. `df.expect_column_values_to_be_of_type(column="title", type_="int")`), we receive this output instead (notice the counts and the examples of what caused the failure):

```
{
  "success": false,
  "exception_info": {
    "raised_exception": false,
    "exception_traceback": null,
    "exception_message": null
  },
  "expectation_config": {
    "meta": {},
    "kwargs": {
      "column": "title",
      "type_": "int",
      "result_format": "BASIC"
    },
    "expectation_type": "_expect_column_values_to_be_of_type__map"
  },
  "result": {
    "element_count": 955,
    "missing_count": 0,
    "missing_percent": 0.0,
    "unexpected_count": 955,
    "unexpected_percent": 100.0,
    "unexpected_percent_nonmissing": 100.0,
    "partial_unexpected_list": [
      "How to Deal with Files in Google Colab: What You Need to Know",
      "Machine Learning Methods Explained (+ Examples)",
      "OpenMMLab Computer Vision",
      "..."
    ]
  },
  "meta": {}
}
```

These are just a few of the expectations we can create. Be sure to explore all the [expectations](https://greatexpectations.io/expectations/), including [custom expectations](https://docs.greatexpectations.io/docs/guides/expectations/creating_custom_expectations/overview/). Here are some other popular expectations that don't pertain to our specific dataset but are widely applicable:

- relationships between one feature's values and another's → `expect_column_pair_values_a_to_be_greater_than_b`
- row count (exact or range) of samples → `expect_table_row_count_to_be_between`
- value statistics (mean, std, median, max, min, sum, etc.) → `expect_column_mean_to_be_between`

### Organization

When organizing expectations, it's recommended to start with table-level ones and then move on to the individual feature columns.

#### Table expectations

```
# columns
df.expect_table_columns_to_match_ordered_list(
    column_list=["id", "created_on", "title", "description", "tag"])

# data leak
df.expect_compound_columns_to_be_unique(column_list=["title", "description"])
```

#### Column expectations

```
# id
df.expect_column_values_to_be_unique(column="id")

# created_on
df.expect_column_values_to_not_be_null(column="created_on")
df.expect_column_values_to_match_strftime_format(
    column="created_on", strftime_format="%Y-%m-%d %H:%M:%S")

# title
df.expect_column_values_to_not_be_null(column="title")
df.expect_column_values_to_be_of_type(column="title", type_="str")

# description
df.expect_column_values_to_not_be_null(column="description")
df.expect_column_values_to_be_of_type(column="description", type_="str")

# tag
df.expect_column_values_to_not_be_null(column="tag")
df.expect_column_values_to_be_of_type(column="tag", type_="str")
```

We can group all the expectations together to create an [Expectation Suite](https://docs.greatexpectations.io/en/latest/reference/core_concepts/expectations/expectations.html#expectation-suites) object, which we can use to validate any Dataset module.

```
# Expectation suite
expectation_suite = df.get_expectation_suite(discard_failed_expectations=False)
print(df.validate(expectation_suite=expectation_suite, only_return_failures=True))
```
```
{
  "success": true,
  "results": [],
  "statistics": {
    "evaluated_expectations": 11,
    "successful_expectations": 11,
    "unsuccessful_expectations": 0,
    "success_percent": 100.0
  },
  "evaluation_parameters": {}
}
```

### Projects

So far we've worked with the Great Expectations library at the ad hoc script / notebook level, but we can further organize our expectations by creating a Project.

```
cd tests
great_expectations init
```

This will set up a `tests/great_expectations` directory with the following structure:

```
tests/great_expectations/
├── checkpoints/
├── expectations/
├── plugins/
├── uncommitted/
├── .gitignore
└── great_expectations.yml
```

#### Data source

绗竴姝ユ槸寤虹珛transformers`datasource`锛屽憡璇?Great Expectations transformers鏁版嵁鍦ㄥ摢閲岋細 ``` great_expectations datasource new ``` ``` What data would you like Great Expectations to connect to? 1. Files on a filesystem (for processing with Pandas or Spark) 馃憟 2. Relational database (SQL) ``` ``` What are you processing your files with? 1. Pandas 馃憟 2. PySpark ``` ``` Enter the path of the root directory where the data files are stored: ../data ``` #### Suites 鎵嬪姩銆佷氦浜掓垨鑷姩鍒涘缓鏈熸湜骞跺皢瀹冧滑淇濆瓨涓簊uite锛堝鐗瑰畾鏁版嵁assert鐨勪竴缁勬湡鏈涳級銆? ``` great_expectations suite new ``` ``` How would you like to create your Expectation Suite? 1. Manually, without interacting with a sample batch of data (default) 2. Interactively, with a sample batch of data 馃憟 3. Automatically, using a profiler ``` ``` Which data asset (accessible by data connector "default_inferred_data_connector_name") would you like to use? 1. labeled_projects.csv 2. projects.csv 馃憟 3. tags.csv ``` ``` Name the new Expectation Suite [projects.csv.warning]: projects ``` 杩欏皢鎵撳紑涓€涓氦浜掑紡note锛屽彲浠ュ湪鍏朵腑娣诲姞鏈熸湜銆傚鍒跺苟绮樿创涓嬮潰鐨勬湡鏈涘苟杩愯鎵€鏈夊崟鍏冩牸銆俙tags.csv`瀵瑰拰閲嶅姝ゆ楠labeled_projects.csv`銆? 
![great expectations suite](https://upload-images.jianshu.io/upload_images/27840083-7f66bc4773236bf1.png)

> Expectations for `projects.csv`
>
> Table expectations
>
> ```
> # Presence of features
> validator.expect_table_columns_to_match_ordered_list(
>     column_list=["id", "created_on", "title", "description"])
> validator.expect_compound_columns_to_be_unique(column_list=["title", "description"])  # data leak
> ```
> Column expectations:
>
> ```
> # id
> validator.expect_column_values_to_be_unique(column="id")
>
> # created_on
> validator.expect_column_values_to_not_be_null(column="created_on")
> validator.expect_column_values_to_match_strftime_format(
>     column="created_on", strftime_format="%Y-%m-%d %H:%M:%S")
>
> # title
> validator.expect_column_values_to_not_be_null(column="title")
> validator.expect_column_values_to_be_of_type(column="title", type_="str")
>
> # description
> validator.expect_column_values_to_not_be_null(column="description")
> validator.expect_column_values_to_be_of_type(column="description", type_="str")
> ```
> Expectations for `tags.csv`
>
> Table expectations
>
> ```
> # Presence of features
> validator.expect_table_columns_to_match_ordered_list(column_list=["id", "tag"])
> ```
> Column expectations:
>
> ```
> # id
> validator.expect_column_values_to_be_unique(column="id")
>
> # tag
> validator.expect_column_values_to_not_be_null(column="tag")
> validator.expect_column_values_to_be_of_type(column="tag", type_="str")
> ```
> Expectations for `labeled_projects.csv`
>
> Table expectations
>
> ```
> # Presence of features
> validator.expect_table_columns_to_match_ordered_list(
>     column_list=["id", "created_on", "title", "description", "tag"])
> validator.expect_compound_columns_to_be_unique(column_list=["title", "description"])  # data leak
> ```
> Column expectations:
>
> ```
> # id
> validator.expect_column_values_to_be_unique(column="id")
>
> # created_on
> validator.expect_column_values_to_not_be_null(column="created_on")
> validator.expect_column_values_to_match_strftime_format(
>     column="created_on", strftime_format="%Y-%m-%d %H:%M:%S")
>
> # title
> validator.expect_column_values_to_not_be_null(column="title")
> validator.expect_column_values_to_be_of_type(column="title", type_="str")
>
> # description
> validator.expect_column_values_to_not_be_null(column="description")
> validator.expect_column_values_to_be_of_type(column="description", type_="str")
>
> # tag
> validator.expect_column_values_to_not_be_null(column="tag")
> validator.expect_column_values_to_be_of_type(column="tag", type_="str")
> ```

All of these expectations are saved under `great_expectations/expectations`:

```
great_expectations/
├── expectations/
│   ├── labeled_projects.json
│   ├── projects.json
│   └── tags.json
```

We can also list the suites with:

```
great_expectations suite list
```
```
Using v3 (Batch Request) API
3 Expectation Suites found:
 - labeled_projects
 - projects
 - tags
```

To edit a suite, we can execute the following CLI command (followed by the name of the suite):

```
great_expectations suite edit 
```

#### Checkpoints

Create Checkpoints, where a suite of expectations is applied to a specific data asset. This is a great way to programmatically apply checkpoints to our existing and new data sources.

```
cd tests
great_expectations checkpoint new CHECKPOINT_NAME
```

So for our project, it would be:

```
great_expectations checkpoint new projects
great_expectations checkpoint new tags
great_expectations checkpoint new labeled_projects
```

Each of these checkpoint creation calls launches a notebook where we can define which suites the checkpoint applies to. We have to change the lines for `data_asset_name` (the data asset to run the checkpoint suite on) and `expectation_suite_name` (the name of the suite to use). For example, the `projects` checkpoint would use the `projects.csv` data asset and the `projects` suite.

> Checkpoints can share the same suite, as long as the schema and validations are applicable.

```
my_checkpoint_name = "projects"  # This was populated from your CLI command.

yaml_config = f"""
name: {my_checkpoint_name}
config_version: 1.0
class_name: SimpleCheckpoint
run_name_template: "%Y%m%d-%H%M%S-my-run-name-template"
validations:
  - batch_request:
      datasource_name: local_data
      data_connector_name: default_inferred_data_connector_name
      data_asset_name: projects.csv
      data_connector_query:
        index: -1
    expectation_suite_name: projects
"""
print(yaml_config)
```

> Validate autofills
>
> Be sure that `datasource_name`, `data_asset_name` and `expectation_suite_name` are all what we want them to be (Great Expectations autofills them with assumptions that may not always be accurate).

Repeat these same steps for the `tags` and `labeled_projects` checkpoints, and then we're ready to execute them:

```
great_expectations checkpoint run projects
great_expectations checkpoint run tags
great_expectations checkpoint run labeled_projects
```

![great expectations checkpoint](https://upload-images.jianshu.io/upload_images/27840083-14165996614fed0a.png)

At the end of this lesson, we'll create a target in our `Makefile` that runs all these tests (code, data and models), and we'll automate their execution in our [pre-commit lesson](https://franztao.github.io/2022/10/26/Pre_commit/).

> note
>
> We've applied expectations to our source dataset, but there are many other key areas where the data should be tested as well, for example the intermediate outputs of processes such as cleaning, augmentation, splitting, preprocessing, tokenization, etc.

### Documentation

When we create expectations using the CLI application, Great Expectations automatically generates documentation for our tests. It also stores information about validation runs and their results. We can build and launch the generated data documentation with the following command: `great_expectations docs build`

![data docs](https://upload-images.jianshu.io/upload_images/27840083-47f1aa6977b84948.png)

> By default, Great Expectations stores our expectations, results and metrics locally, but for production we'll want to set up remote [metadata stores](https://docs.greatexpectations.io/docs/guides/setup/#metadata-stores).

### Production

The advantage of using a library such as great expectations, as opposed to isolated assert statements, is that we can:

- reduce redundant effort in creating tests across data modalities
- automatically create testing [checkpoints](https://franztao.github.io/2022/10/01/Testing/#checkpoints) to execute as our dataset grows
- automatically generate [documentation and run reports](https://franztao.github.io/2022/10/01/Testing/#documentation) for our expectations
- easily connect with backend data sources such as local file systems, S3, databases, etc.

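For contrast, the "isolated assert statements" mentioned above might look like the following sketch. It is an illustration under stated assumptions rather than anything from the project: the two rows are taken from the preview table earlier in this lesson (with the truncated description column left out), and `check_rows` is a hypothetical helper that mirrors a handful of the expectations by hand using only the standard library.

```python
from datetime import datetime

# Two rows from the preview table above (description column omitted).
rows = [
    {"id": 9, "created_on": "2020-02-24 16:24:45",
     "title": "Awesome Graph Classification", "tag": "graph-learning"},
    {"id": 15, "created_on": "2020-02-28 23:55:26",
     "title": "Awesome Monte Carlo Tree Search", "tag": "reinforcement-learning"},
]


def check_rows(rows):
    """Hypothetical hand-rolled checks mirroring a few expectations."""
    expected_columns = ["id", "created_on", "title", "tag"]
    for row in rows:
        # Presence and order of features.
        assert list(row.keys()) == expected_columns
        # Missing values and type adherence.
        assert row["tag"] is not None
        assert isinstance(row["title"], str)
        # Timestamp format (strptime raises ValueError on a mismatch).
        datetime.strptime(row["created_on"], "%Y-%m-%d %H:%M:%S")
    # Unique ids across the whole table.
    ids = [row["id"] for row in rows]
    assert len(ids) == len(set(ids))
    return True


print(check_rows(rows))
```

Checks like these have to be rewritten for every new dataset and produce no report or documentation when they fail, which is the redundancy the list above refers to.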
Many of these expectations will be executed when the data is extracted, loaded and transformed during our [DataOps workflows](https://franztao.github.io/2022/11/10/Orchestration/#dataops). Typically, the data is extracted from a source ([database](https://franztao.github.io/2022/11/10/Data_stack/#database), [API](https://franztao.github.io/2022/10/01/RESTful_API/), etc.) and loaded into a data system (e.g. a [data warehouse](https://franztao.github.io/2022/11/10/Data_stack/#data-warehouse)) before being transformed there (e.g. using [dbt](https://www.getdbt.com/)) for downstream applications. Throughout these tasks, Great Expectations checkpoint validations can be run to ensure the validity of the data and of the changes applied to it. We'll see a simplified version of when data validation should occur in our data workflows in the [orchestration lesson](https://franztao.github.io/2022/11/10/Orchestration/#dataops).

![ELT pipelines in production](https://upload-images.jianshu.io/upload_images/27840083-ab3a8c04aa359258.png)

> Learn more about the different data systems in our [data stack lesson](https://franztao.github.io/2022/11/10/Data_stack/) if you're not familiar with them.

## Models

The final aspect of testing ML systems involves testing our models during training, evaluation, inference and deployment.

### Training

We want to write tests iteratively while we're developing our training pipelines so that we can catch errors quickly. This is especially important because, unlike traditional software, ML systems can run to completion without throwing any exceptions or errors and still produce incorrect systems. We also want to catch errors quickly to save time and compute.

- Check the shapes and values of model output

  ```
  assert model(inputs).shape == torch.Size([len(inputs), num_classes])
  ```

- Check for decreasing loss after one batch of training

  ```
  assert epoch_loss < prev_epoch_loss
  ```

- Overfit on a batch

  ```
  accuracy = train(model, inputs=batches[0])
  assert accuracy == pytest.approx(0.95, abs=0.05)  # 0.95 ± 0.05
  ```

- Train to completion (tests early stopping, saving, etc.)

  ```
  train(model)
  assert learning_rate >= min_learning_rate
  assert artifacts
  ```

- On different devices

  ```
  assert train(model, device=torch.device("cpu"))
  assert train(model, device=torch.device("cuda"))
  ```

> Note
>
> You can mark the compute-intensive tests with a pytest marker and only execute them when a change is made to a part of the system that affects the model.
>
> ```
> @pytest.mark.training
> def test_train_model():
>     ...
> ```

### Behavioral testing

Behavioral testing is the process of testing input data and expected outputs while treating the model as a black box (model-agnostic evaluation). These tests don't necessarily have to be adversarial in nature; they are more about the types of perturbations we can expect to see in the real world once our model is deployed. A landmark paper on this topic is [Beyond Accuracy: Behavioral Testing of NLP Models with CheckList](https://arxiv.org/abs/2005.04118), which breaks behavioral testing down into three types of tests:

- `invariance`: changes should not affect the outputs.

  ```
  # INVariance via verb injection (changes should not affect outputs)
  tokens = ["revolutionized", "disrupted"]
  texts = [f"Transformers applied to NLP have {token} the ML field." for token in tokens]
  predict.predict(texts=texts, artifacts=artifacts)
  ```

  ```
  ['natural-language-processing', 'natural-language-processing']
  ```

- `directional`: changes should affect the outputs.

  ```
  # DIRectional expectations (changes with known outputs)
  tokens = ["text classification", "image classification"]
  texts = [f"ML applied to {token}." for token in tokens]
  predict.predict(texts=texts, artifacts=artifacts)
  ```

  ```
  ['natural-language-processing', 'computer-vision']
  ```

- `minimum functionality`: simple combinations of inputs and expected outputs.

  ```
  # Minimum Functionality Tests (simple input/output pairs)
  tokens = ["natural language processing", "mlops"]
  texts = [f"{token} is the next big wave in machine learning." for token in tokens]
  predict.predict(texts=texts, artifacts=artifacts)
  ```

  ```
  ['natural-language-processing', 'mlops']
  ```

> Adversarial testing
>
> Each of these types of tests can also include adversarial tests, such as testing with common biased tokens or noisy tokens.
>
> ```
> texts = [
>     "CNNs for text classification.",  # CNNs are typically seen in computer-vision projects
>     "This should not produce any relevant topics."  # should predict `other` label
> ]
> predict.predict(texts=texts, artifacts=artifacts)
> ```

We can convert these tests into systematic parametrized tests:
```
mkdir tests/model
touch tests/model/test_behavioral.py
```

```
# tests/model/test_behavioral.py
from pathlib import Path

import pytest

from config import config
from tagifai import main, predict


@pytest.fixture(scope="module")
def artifacts():
    # Load the artifacts once per test module using the saved run ID.
    run_id = Path(config.CONFIG_DIR, "run_id.txt").read_text()
    artifacts = main.load_artifacts(run_id=run_id)
    return artifacts


@pytest.mark.parametrize(
    "text_a, text_b, tag",
    [
        (
            "Transformers applied to NLP have revolutionized machine learning.",
            "Transformers applied to NLP have disrupted machine learning.",
            "natural-language-processing",
        ),
    ],
)
def test_inv(text_a, text_b, tag, artifacts):
    """INVariance via verb injection (changes should not affect outputs)."""
    tag_a = predict.predict(texts=[text_a], artifacts=artifacts)[0]["predicted_tag"]
    tag_b = predict.predict(texts=[text_b], artifacts=artifacts)[0]["predicted_tag"]
    assert tag_a == tag_b == tag
```

View `tests/model/test_behavioral.py`:

```
from pathlib import Path

import pytest

from config import config
from tagifai import main, predict


@pytest.fixture(scope="module")
def artifacts():
    # Load the artifacts once per test module using the saved run ID.
    run_id = Path(config.CONFIG_DIR, "run_id.txt").read_text()
    artifacts = main.load_artifacts(run_id=run_id)
    return artifacts


@pytest.mark.parametrize(
    "text, tag",
    [
        (
            "Transformers applied to NLP have revolutionized machine learning.",
            "natural-language-processing",
        ),
        (
            "Transformers applied to NLP have disrupted machine learning.",
            "natural-language-processing",
        ),
    ],
)
def test_inv(text, tag, artifacts):
    """INVariance via verb injection (changes should not affect outputs)."""
    predicted_tag = predict.predict(texts=[text], artifacts=artifacts)[0]["predicted_tag"]
    assert tag == predicted_tag


@pytest.mark.parametrize(
    "text, tag",
    [
        (
            "ML applied to text classification.",
            "natural-language-processing",
        ),
        (
            "ML applied to image classification.",
            "computer-vision",
        ),
        (
            "CNNs for text classification.",
            "natural-language-processing",
        ),
    ],
)
def test_dir(text, tag, artifacts):
    """DIRectional expectations (changes with known outputs)."""
    predicted_tag = predict.predict(texts=[text], artifacts=artifacts)[0]["predicted_tag"]
    assert tag == predicted_tag


@pytest.mark.parametrize(
    "text, tag",
    [
        (
            "Natural language processing is the next big wave in machine learning.",
            "natural-language-processing",
        ),
        (
            "MLOps is the next big wave in machine learning.",
            "mlops",
        ),
        (
            "This should not produce any relevant topics.",
            "other",
        ),
    ],
)
def test_mft(text, tag, artifacts):
    """Minimum Functionality Tests (simple input/output pairs)."""
    predicted_tag = predict.predict(texts=[text], artifacts=artifacts)[0]["predicted_tag"]
    assert tag == predicted_tag
```

### Inference

Once our model is deployed, most users will be using it for inference (directly or indirectly), so it's very important that we test all aspects of it.

#### Loading artifacts

This is the first time we're not loading our components from memory, so we want to ensure that the required artifacts (model weights, encoders, config, etc.) can all be loaded.

```
artifacts = main.load_artifacts(run_id=run_id)
assert isinstance(artifacts["label_encoder"], data.LabelEncoder)
...
```

#### Prediction

Once our artifacts are loaded, we're ready to test our prediction pipelines. We should test samples with just one input as well as with a batch of inputs (e.g. padding can sometimes have unintended consequences).

```
# test our API call directly
data = {
    "texts": [
        {"text": "Transfer learning with transformers for text classification."},
        {"text": "Generative adversarial networks in both PyTorch and TensorFlow."},
    ]
}
response = client.post("/predict", json=data)
assert response.status_code == HTTPStatus.OK
assert response.request.method == "POST"
assert len(response.json()["data"]["predictions"]) == len(data["texts"])
...
```

## Makefile

Let's create a target in our `Makefile` that will allow us to execute all of our tests with one call:

```
# Test
.PHONY: test
test:
	pytest -m "not training"
	cd tests && great_expectations checkpoint run projects
	cd tests && great_expectations checkpoint run tags
	cd tests && great_expectations checkpoint run labeled_projects
```

```
make test
```

## Testing vs. monitoring
We'll conclude by talking about the similarities and distinctions between testing and [monitoring](https://franztao.github.io/2022/10/01/Testing//../monitoring/). They're both integral parts of the ML development pipeline and depend on each other for iteration. Testing assures that our system (code, data and models) passes the expectations that we've established offline. Monitoring, in contrast, checks that these expectations continue to pass online on live production data, while also ensuring that the data distributions remain [comparable](https://franztao.github.io/2022/10/01/Testing//../monitoring/#measuring-drift) to the reference window (typically a subset of the training data) through time $t_n$. When these conditions no longer hold, we need to inspect more closely (retraining may not always fix the root problem).

With [monitoring](https://franztao.github.io/2022/10/01/Testing//../monitoring/), there are quite a few distinct concerns that we didn't have to consider during testing, since it involves (live) data we have yet to see.

- feature and prediction distributions (drift), typing, schema mismatches, etc.
- determining model performance (rolling and window metrics, overall and on slices of data) using indirect signals, since labels may not be readily available.
- in situations with large amounts of data, knowing which data points to label and upsample for training.
- identifying anomalies and outliers.

> We'll cover all of these concepts in much more depth (and code) in our [monitoring](https://franztao.github.io/2022/10/01/Testing//../monitoring/) lesson.

## Resources

- [Great Expectations](https://github.com/great-expectations/great_expectations)
- [The ML Test Score: A Rubric for ML Production Readiness and Technical Debt Reduction](https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/aad9f93b86b7addfea4c419b9100c6cdd26cacea.pdf)
- [Beyond Accuracy: Behavioral Testing of NLP Models with CheckList](https://arxiv.org/abs/2005.04118)
- [Robustness Gym: Unifying the NLP Evaluation Landscape](https://arxiv.org/abs/2101.04840)

The main body of this article is adapted from the following source:

```
@article{madewithml,
    author       = {Goku Mohandas},
    title        = { Made With ML },
    howpublished = {\url{https://madewithml.com/}},
    year         = {2022}
}
```
