SSLCertVerificationError when fetching OpenML datasets in a conda environment on Linux (Ubuntu)

Posted by Anonymous

I'm working on Linux (Ubuntu, specifically) in a conda environment (ML, Python 3.12.9), and I get an SSL certificate verification error when I try to load a dataset from OpenML with sklearn.datasets.fetch_openml. I have already installed certifi inside the environment, but the error persists.
I ran:
from sklearn.datasets import fetch_openml
har = fetch_openml(name="HAR", as_frame=False)

and got the following errors:
SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED]
certificate verify failed: Hostname mismatch, certificate is not valid for 'api.openml.org'. (_ssl.c:1010)
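URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: Hostname mismatch, certificate is not valid for 'api.openml.org'. (_ssl.c:1010)>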

I also tried fetching other datasets and hit the same error.
I have tried everything an AI assistant suggested.
What I have already tried:
  • Installed certifi in the ML environment: pip install certifi (it reports "Requirement already satisfied").
  • Verified that passing the certifi bundle explicitly lets urllib fetch https://example.com:
    import urllib.request, ssl, certifi
    ctx = ssl.create_default_context(cafile=certifi.where())
    urllib.request.urlopen("https://example.com", context=ctx) # works
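
The check above only exercises example.com. Because certifi supplies trusted root CAs, it influences chain validation but not hostname matching, so the same request pointed at api.openml.org may be more informative. A minimal sketch under that assumption; `probe` and `classify_tls_failure` are hypothetical helper names, and the labels they return are illustrative only:

```python
import ssl
import urllib.error
import urllib.request


def classify_tls_failure(exc):
    """Roughly label an exception raised by urlopen (hypothetical helper)."""
    # urllib wraps the underlying SSL error in URLError.reason.
    reason = getattr(exc, "reason", exc)
    if isinstance(reason, ssl.SSLCertVerificationError):
        if "Hostname mismatch" in str(reason):
            return "hostname-mismatch"
        return "chain-or-trust"
    return "other"


def probe(url, cafile=None, timeout=10):
    """Try one HTTPS request and return 'ok' or a failure label."""
    # cafile=None uses the interpreter's default trust store.
    ctx = ssl.create_default_context(cafile=cafile)
    try:
        with urllib.request.urlopen(url, context=ctx, timeout=timeout):
            return "ok"
    except urllib.error.HTTPError:
        # An HTTP error status still means the TLS handshake succeeded.
        return "ok"
    except OSError as exc:  # URLError is a subclass of OSError
        return classify_tls_failure(exc)
```

Calling `probe("https://api.openml.org", cafile=certifi.where())` and comparing the result with `probe("https://example.com", cafile=certifi.where())` would show whether the failure is specific to that host. A "hostname-mismatch" result with the certifi bundle would suggest the CA bundle is not the cause.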
The traceback points to Python 3.12 inside my conda environment: the paths refer to ~/anaconda3/envs/ML/lib/python3.12/...
What I don't know / what I need
  • Why fetch_openml (via scikit-learn) fails with a hostname mismatch for api.openml.org even though certifi is installed and I can reach other HTTPS sites using the certifi bundle.
  • Whether this is caused by a local proxy/middlebox/DNS issue, a problem with the OpenML server's certificate, or a mismatch in how scikit-learn / urllib picks up the CA bundle inside my conda environment.
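
One way to narrow down the second point could be to look at which names the certificate actually presented for api.openml.org carries. The sketch below keeps chain verification on but disables only the hostname check so the handshake can finish; `presented_names` and `dns_names` are hypothetical helper names, and the approach assumes the chain itself is trusted (otherwise the handshake would still fail):

```python
import socket
import ssl


def dns_names(cert):
    """Return the DNS entries of subjectAltName from a getpeercert() dict."""
    return [value for kind, value in cert.get("subjectAltName", ()) if kind == "DNS"]


def presented_names(host, port=443, timeout=10):
    """Connect to host and list the names on the certificate it presents."""
    ctx = ssl.create_default_context()
    # Keep chain verification on, but skip the hostname check so the
    # handshake can complete and the certificate can be inspected.
    ctx.check_hostname = False
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            return dns_names(tls.getpeercert())
```

If `presented_names("api.openml.org")` listed names belonging to a router, captive portal, or corporate proxy rather than openml.org, that would point toward something on the local network answering for the host instead of the OpenML server.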
Diagnostic information I can add if useful

I can paste the output of the commands below; tell me which ones you need:
python -c "import certifi; print(certifi.where())"
conda list scikit-learn
echo $SSL_CERT_FILE
# TLS/SSL certificate details from the server:
openssl s_client -connect api.openml.org:443 -servername api.openml.org </dev/null 2>/dev/null | openssl x509 -noout -text
# or a quick curl:
curl -vI https://api.openml.org
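
The DNS and proxy hypotheses could also be checked from inside the same Python environment, which sees exactly what urllib sees. A rough sketch using only the standard library; `resolve` and `proxy_settings` are hypothetical helper names:

```python
import socket
import urllib.request


def resolve(host, port=443):
    """Return the sorted set of IP addresses that host resolves to."""
    try:
        infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        # Resolution failed entirely; report no addresses.
        return []
    return sorted({info[4][0] for info in infos})


def proxy_settings():
    """Return the proxies urllib would pick up from the environment."""
    return urllib.request.getproxies()
```

Comparing `resolve("api.openml.org")` with the address that another network (for example a phone hotspot) resolves, and checking whether `proxy_settings()` is non-empty, may help separate a DNS or proxy problem from a server-side one.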

I also added
export SSL_CERT_FILE=$(python -c "import certifi; print(certifi.where())")

to my ~/.bashrc and restarted the kernel, but nothing changed.
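
Since ~/.bashrc is read by interactive bash shells, a kernel started some other way may not inherit the variable. A small sketch for checking what the running kernel actually sees; `ssl_env_report` is a hypothetical helper name, and it only reads standard-library values:

```python
import os
import ssl


def ssl_env_report():
    """Collect the CA-related settings visible to this Python process."""
    paths = ssl.get_default_verify_paths()
    return {
        "SSL_CERT_FILE": os.environ.get("SSL_CERT_FILE"),
        "SSL_CERT_DIR": os.environ.get("SSL_CERT_DIR"),
        "default_cafile": paths.cafile,
        "default_capath": paths.capath,
        "openssl_version": ssl.OPENSSL_VERSION,
    }
```

Running `ssl_env_report()` in a notebook cell would show whether SSL_CERT_FILE reached the kernel. Either way, this setting governs which CAs are trusted and would not be expected to change a hostname-mismatch result.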
Raw output:
---------------------------------------------------------------------------
SSLCertVerificationError Traceback (most recent call last)
File ~/anaconda3/envs/ML/lib/python3.12/urllib/request.py:1344, in AbstractHTTPHandler.do_open(self, http_class, req, **http_conn_args)
1343 try:
-> 1344 h.request(req.get_method(), req.selector, req.data, headers,
1345 encode_chunked=req.has_header('Transfer-encoding'))
1346 except OSError as err: # timeout error

File ~/anaconda3/envs/ML/lib/python3.12/http/client.py:1338, in HTTPConnection.request(self, method, url, body, headers, encode_chunked)
1337 """Send a complete request to the server."""
-> 1338 self._send_request(method, url, body, headers, encode_chunked)

File ~/anaconda3/envs/ML/lib/python3.12/http/client.py:1384, in HTTPConnection._send_request(self, method, url, body, headers, encode_chunked)
1383 body = _encode(body, 'body')
-> 1384 self.endheaders(body, encode_chunked=encode_chunked)

File ~/anaconda3/envs/ML/lib/python3.12/http/client.py:1333, in HTTPConnection.endheaders(self, message_body, encode_chunked)
1332 raise CannotSendHeader()
-> 1333 self._send_output(message_body, encode_chunked=encode_chunked)

File ~/anaconda3/envs/ML/lib/python3.12/http/client.py:1093, in HTTPConnection._send_output(self, message_body, encode_chunked)
1092 del self._buffer[:]
-> 1093 self.send(msg)
1095 if message_body is not None:
1096
1097 # create a consistent interface to message_body

File ~/anaconda3/envs/ML/lib/python3.12/http/client.py:1037, in HTTPConnection.send(self, data)
1036 if self.auto_open:
-> 1037 self.connect()
1038 else:

File ~/anaconda3/envs/ML/lib/python3.12/http/client.py:1479, in HTTPSConnection.connect(self)
1477 server_hostname = self.host
-> 1479 self.sock = self._context.wrap_socket(self.sock,
1480 server_hostname=server_hostname)

File ~/anaconda3/envs/ML/lib/python3.12/ssl.py:455, in SSLContext.wrap_socket(self, sock, server_side, do_handshake_on_connect, suppress_ragged_eofs, server_hostname, session)
449 def wrap_socket(self, sock, server_side=False,
450 do_handshake_on_connect=True,
451 suppress_ragged_eofs=True,
452 server_hostname=None, session=None):
453 # SSLSocket class handles server_hostname encoding before it calls
454 # ctx._wrap_socket()
--> 455 return self.sslsocket_class._create(
456 sock=sock,
457 server_side=server_side,
458 do_handshake_on_connect=do_handshake_on_connect,
459 suppress_ragged_eofs=suppress_ragged_eofs,
460 server_hostname=server_hostname,
461 context=self,
462 session=session
463 )

File ~/anaconda3/envs/ML/lib/python3.12/ssl.py:1041, in SSLSocket._create(cls, sock, server_side, do_handshake_on_connect, suppress_ragged_eofs, server_hostname, context, session)
1040 raise ValueError("do_handshake_on_connect should not be specified for non-blocking sockets")
-> 1041 self.do_handshake()
1042 except:

File ~/anaconda3/envs/ML/lib/python3.12/ssl.py:1319, in SSLSocket.do_handshake(self, block)
1318 self.settimeout(None)
-> 1319 self._sslobj.do_handshake()
1320 finally:

SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: Hostname mismatch, certificate is not valid for 'api.openml.org'. (_ssl.c:1010)

During handling of the above exception, another exception occurred:

URLError Traceback (most recent call last)
Cell In[8], line 1
----> 1 har = fetch_openml(name="HAR", as_frame=False)

File ~/anaconda3/envs/ML/lib/python3.12/site-packages/sklearn/utils/_param_validation.py:216, in validate_params.<locals>.decorator.<locals>.wrapper(*args, **kwargs)
210 try:
211 with config_context(
212 skip_parameter_validation=(
213 prefer_skip_nested_validation or global_skip_validation
214 )
215 ):
--> 216 return func(*args, **kwargs)
217 except InvalidParameterError as e:
218 # When the function is just a wrapper around an estimator, we allow
219 # the function to delegate validation to the estimator, but we replace
220 # the name of the estimator by the name of the function in the error
221 # message to avoid confusion.
222 msg = re.sub(
223 r"parameter of \w+ must be",
224 f"parameter of {func.__qualname__} must be",
225 str(e),
226 )

File ~/anaconda3/envs/ML/lib/python3.12/site-packages/sklearn/datasets/_openml.py:1011, in fetch_openml(name, version, data_id, data_home, target_column, cache, return_X_y, as_frame, n_retries, delay, parser, read_csv_kwargs)
1005 if data_id is not None:
1006 raise ValueError(
1007 "Dataset data_id={} and name={} passed, but you can only "
1008 "specify a numeric data_id or a name, not "
1009 "both.".format(data_id, name)
1010 )
-> 1011 data_info = _get_data_info_by_name(
1012 name, version, data_home, n_retries=n_retries, delay=delay
1013 )
1014 data_id = data_info["did"]
1015 elif data_id is not None:
1016 # from the previous if statement, it is given that name is None

File ~/anaconda3/envs/ML/lib/python3.12/site-packages/sklearn/datasets/_openml.py:302, in _get_data_info_by_name(name, version, data_home, n_retries, delay)
300 url = _SEARCH_NAME.format(name) + "/status/active/"
301 error_msg = "No active dataset {} found.".format(name)
--> 302 json_data = _get_json_content_from_openml_api(
303 url,
304 error_msg,
305 data_home=data_home,
306 n_retries=n_retries,
307 delay=delay,
308 )
309 res = json_data["data"]["dataset"]
310 if len(res) > 1:

File ~/anaconda3/envs/ML/lib/python3.12/site-packages/sklearn/datasets/_openml.py:246, in _get_json_content_from_openml_api(url, error_message, data_home, n_retries, delay)
243 return json.loads(response.read().decode("utf-8"))
245 try:
--> 246 return _load_json()
247 except HTTPError as error:
248 # 412 is an OpenML specific error code, indicating a generic error
249 # (e.g., data not found)
250 if error.code != 412:

File ~/anaconda3/envs/ML/lib/python3.12/site-packages/sklearn/datasets/_openml.py:67, in _retry_with_clean_cache.<locals>.decorator.<locals>.wrapper(*args, **kw)
65 return f(*args, **kw)
66 try:
---> 67 return f(*args, **kw)
68 except URLError:
69 raise

File ~/anaconda3/envs/ML/lib/python3.12/site-packages/sklearn/datasets/_openml.py:241, in _get_json_content_from_openml_api.<locals>._load_json()
238 @_retry_with_clean_cache(url, data_home=data_home)
239 def _load_json():
240 with closing(
--> 241 _open_openml_url(url, data_home, n_retries=n_retries, delay=delay)
242 ) as response:
243 return json.loads(response.read().decode("utf-8"))

File ~/anaconda3/envs/ML/lib/python3.12/site-packages/sklearn/datasets/_openml.py:173, in _open_openml_url(openml_path, data_home, n_retries, delay)
166 try:
167 # Create a tmpdir as a subfolder of dir_name where the final file will
168 # be moved to if the download is successful. This guarantees that the
169 # renaming operation to the final location is atomic to ensure the
170 # concurrence safety of the dataset caching mechanism.
171 with TemporaryDirectory(dir=dir_name) as tmpdir:
172 with closing(
--> 173 _retry_on_network_error(n_retries, delay, req.full_url)(urlopen)(
174 req
175 )
176 ) as fsrc:
177 opener: Callable
178 if is_gzip_encoded(fsrc):

File ~/anaconda3/envs/ML/lib/python3.12/site-packages/sklearn/datasets/_openml.py:103, in _retry_on_network_error.<locals>.decorator.<locals>.wrapper(*args, **kwargs)
101 while True:
102 try:
--> 103 return f(*args, **kwargs)
104 except (URLError, TimeoutError) as e:
105 # 412 is a specific OpenML error code.
106 if isinstance(e, HTTPError) and e.code == 412:

File ~/anaconda3/envs/ML/lib/python3.12/urllib/request.py:215, in urlopen(url, data, timeout, cafile, capath, cadefault, context)
213 else:
214 opener = _opener
--> 215 return opener.open(url, data, timeout)

File ~/anaconda3/envs/ML/lib/python3.12/urllib/request.py:515, in OpenerDirector.open(self, fullurl, data, timeout)
512 req = meth(req)
514 sys.audit('urllib.Request', req.full_url, req.data, req.headers, req.get_method())
--> 515 response = self._open(req, data)
517 # post-process response
518 meth_name = protocol+"_response"

File ~/anaconda3/envs/ML/lib/python3.12/urllib/request.py:532, in OpenerDirector._open(self, req, data)
529 return result
531 protocol = req.type
--> 532 result = self._call_chain(self.handle_open, protocol, protocol +
533 '_open', req)
534 if result:
535 return result

File ~/anaconda3/envs/ML/lib/python3.12/urllib/request.py:492, in OpenerDirector._call_chain(self, chain, kind, meth_name, *args)
490 for handler in handlers:
491 func = getattr(handler, meth_name)
--> 492 result = func(*args)
493 if result is not None:
494 return result

File ~/anaconda3/envs/ML/lib/python3.12/urllib/request.py:1392, in HTTPSHandler.https_open(self, req)
1391 def https_open(self, req):
-> 1392 return self.do_open(http.client.HTTPSConnection, req,
1393 context=self._context)

File ~/anaconda3/envs/ML/lib/python3.12/urllib/request.py:1347, in AbstractHTTPHandler.do_open(self, http_class, req, **http_conn_args)
1344 h.request(req.get_method(), req.selector, req.data, headers,
1345 encode_chunked=req.has_header('Transfer-encoding'))
1346 except OSError as err: # timeout error
-> 1347 raise URLError(err)
1348 r = h.getresponse()
1349 except:

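URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: Hostname mismatch, certificate is not valid for 'api.openml.org'. (_ssl.c:1010)>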


More details here: https://stackoverflow.com/questions/798 ... nment-on-l