Loading a file with more than one line of JSON into pandas

bestscript 2023. 3. 16. 21:36

I am trying to read a JSON file into a Python pandas (0.14.0) data frame. Here is the first line of the JSON file:

{"votes": {"funny": 0, "useful": 0, "cool": 0}, "user_id": "P_Mk0ygOilLJo4_WEvabAA", "review_id": "OeT5kgUOe3vcN7H6ImVmZQ", "stars": 3, "date": "2005-08-26", "text": "This is a pretty typical cafe.  The sandwiches and wraps are good but a little overpriced and the food items are the same.  The chicken caesar salad wrap is my favorite here but everything else is pretty much par for the course.", "type": "review", "business_id": "Jp9svt7sRT4zwdbzQ8KQmw"}

I am trying to do the following: df = pd.read_json(path).

I get the following error (with the full traceback):

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/d/anaconda/lib/python2.7/site-packages/pandas/io/json.py", line 198, in read_json
    date_unit).parse()
  File "/Users/d/anaconda/lib/python2.7/site-packages/pandas/io/json.py", line 266, in parse
    self._parse_no_numpy()
  File "/Users/d/anaconda/lib/python2.7/site-packages/pandas/io/json.py", line 483, in _parse_no_numpy
    loads(json, precise_float=self.precise_float), dtype=None)
ValueError: Trailing data

What is this Trailing data error? How do I read the file into a data frame?

Following some suggestions, here are a few lines of the .json file:

{"votes": {"funny": 0, "useful": 0, "cool": 0}, "user_id": "P_Mk0ygOilLJo4_WEvabAA", "review_id": "OeT5kgUOe3vcN7H6ImVmZQ", "stars": 3, "date": "2005-08-26", "text": "This is a pretty typical cafe.  The sandwiches and wraps are good but a little overpriced and the food items are the same.  The chicken caesar salad wrap is my favorite here but everything else is pretty much par for the course.", "type": "review", "business_id": "Jp9svt7sRT4zwdbzQ8KQmw"}
{"votes": {"funny": 0, "useful": 0, "cool": 0}, "user_id": "TNJRTBrl0yjtpAACr1Bthg", "review_id": "qq3zF2dDUh3EjMDuKBqhEA", "stars": 3, "date": "2005-11-23", "text": "I agree with other reviewers - this is a pretty typical financial district cafe.  However, they have fantastic pies.  I ordered three pies for an office event (apple, pumpkin cheesecake, and pecan) - all were delicious, particularly the cheesecake.  The sucker weighed in about 4 pounds - no joke.\n\nNo surprises on the cafe side - great pies and cakes from the catering business.", "type": "review", "business_id": "Jp9svt7sRT4zwdbzQ8KQmw"}
{"votes": {"funny": 0, "useful": 0, "cool": 0}, "user_id": "H_mngeK3DmjlOu595zZMsA", "review_id": "i3eQTINJXe3WUmyIpvhE9w", "stars": 3, "date": "2005-11-23", "text": "Decent enough food, but very overpriced. Just a large soup is almost $5. Their specials are $6.50, and with an overpriced soda or juice, it's approaching $10. A bit much for a cafe lunch!", "type": "review", "business_id": "Jp9svt7sRT4zwdbzQ8KQmw"}

According to its specification, the .json file I am using contains one JSON object per line.

As suggested, I tried the jsonlint.com website, and it reports the following error:

Parse error on line 14:
...t7sRT4zwdbzQ8KQmw"}{    "votes": {
----------------------^
Expecting 'EOF', '}', ',', ']'
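
Both messages point at the same cause: a strict JSON parser expects exactly one top-level value per document, so a second object after the first one counts as trailing data. The following is a minimal sketch with the standard library's json module; the two sample records are made up for illustration and are not taken from the real file.

```python
import json

# Two JSON objects, one per line (hypothetical sample data).
text = '{"stars": 3, "type": "review"}\n{"stars": 4, "type": "review"}\n'

# Parsing the whole text as ONE document fails, because a second
# top-level object follows the first.
try:
    json.loads(text)
    whole_file_ok = True
except json.JSONDecodeError as exc:
    whole_file_ok = False
    print("whole file:", exc.msg)

# Parsing each non-empty line on its own works.
records = [json.loads(line) for line in text.splitlines() if line.strip()]
print(records)
```

Each line is valid JSON by itself; only the file as a whole is not.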

As of pandas version 0.19.0, you can use the lines parameter, like so:

import pandas as pd

data = pd.read_json('/path/to/file.json', lines=True)
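
To try lines=True without a real file, the sketch below feeds two shortened, made-up records through io.StringIO. It assumes a pandas version that supports lines=True; the variable names are only for illustration.

```python
import io
import pandas as pd

# Hypothetical sample in the same one-object-per-line layout.
sample = (
    '{"votes": {"funny": 0, "useful": 1}, "stars": 3, "type": "review"}\n'
    '{"votes": {"funny": 2, "useful": 0}, "stars": 4, "type": "review"}\n'
)

# lines=True tells pandas to parse each line as a separate record.
df = pd.read_json(io.StringIO(sample), lines=True)

print(df.shape)
print(df["stars"].tolist())
```

Each line becomes one row. Nested objects such as votes end up as dictionaries inside a single column.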

You have to read it line by line. For example, you can use the following code provided by ryptophan on reddit:

import io
import pandas as pd

# read the file line by line, dropping the trailing "\n"
# and skipping blank lines
with open('your.json', 'r', encoding='utf-8') as f:
    data = [line.rstrip() for line in f if line.strip()]

# each element of 'data' is an individual JSON object.
# join them with commas and wrap them in square brackets
# to get one JSON array, which is a single valid JSON document
data_json_str = "[" + ",".join(data) + "]"

# now, load it into pandas (StringIO keeps newer pandas versions
# from warning about a literal JSON string)
data_df = pd.read_json(io.StringIO(data_json_str))

The following code helped me load the JSON content into a dataframe:

import json
import pandas as pd

with open('Appointment.json', encoding="utf8") as f:
    # parse each non-empty line into a dict
    data = [json.loads(line) for line in f if line.strip()]
df = pd.DataFrame(data)  # build the dataframe from the list of dicts
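
Once every line is a dict, nested fields such as votes can also be spread into their own columns. This is a sketch that assumes pandas 1.0 or later, where json_normalize is available at the top level; the records are shortened, made-up samples.

```python
import json
import pandas as pd

# Hypothetical lines in the same layout as the review file.
lines = [
    '{"votes": {"funny": 0, "useful": 1}, "stars": 3}',
    '{"votes": {"funny": 2, "useful": 0}, "stars": 4}',
]

records = [json.loads(line) for line in lines]

# Nested keys become dotted column names such as "votes.funny".
flat = pd.json_normalize(records)
print(sorted(flat.columns))
```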

I ran into the same problem. It happens when the data is written as records separated by line endings such as '\n'. You first have to read each line and convert it to a Python built-in type. I solved it like this:

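import json
import pandas as pd

with open("/path/to/file") as f:
    content = f.readlines()

# json.loads instead of eval: it does not execute code and it
# understands the JSON literals true, false and null
data = [json.loads(c) for c in content if c.strip()]
data = pd.DataFrame(data)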
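
One caveat when converting each line to a Python object: JSON literals such as true, false and null are not Python names, so evaluating a line as Python source fails on them, while json.loads maps them to True, False and None. A small sketch with a made-up line:

```python
import json

# Hypothetical record that uses JSON-only literals.
line = '{"open": true, "note": null}'

# Evaluating the line as Python source fails on the name 'true'.
try:
    eval(line)
    eval_error = None
except NameError as exc:
    eval_error = str(exc)

# json.loads understands the JSON literals.
parsed = json.loads(line)

print(eval_error)
print(parsed)
```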

Good luck.

I had a similar problem.

It turned out that pd.read_json('myfile.json') returned this "Trailing data" error because the file was not in the same folder as my notebook.

I figured it out because when I tried open('myfile.json', 'r') I got a FileNotFound error, so I checked the path.

I had not moved myfile.json into the same folder as the notebook.

Changing the call to pd.read_json('../myfile.json') just worked.
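
One plausible reason for the confusing message is that older pandas releases treated a string that was not an existing path as literal JSON text, so a wrong path surfaced as a parse error. Checking the path first makes the real problem visible. The helper below is a hypothetical sketch, not part of pandas; its name is an assumption, and it assumes the file has one JSON object per line.

```python
from pathlib import Path

import pandas as pd


def read_json_lines(path):
    """Hypothetical helper: fail early with a clear message if the
    file is missing, then read it as one JSON object per line."""
    p = Path(path)
    if not p.is_file():
        # Show the absolute path so a wrong working directory is obvious.
        raise FileNotFoundError(f"No such file: {p.resolve()}")
    return pd.read_json(p, lines=True)
```

Calling read_json_lines('../myfile.json') then either returns the dataframe or reports the exact path that was looked up.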

Reference URL: https://stackoverflow.com/questions/30088006/loading-a-file-with-more-than-one-line-of-json-into-pandas
