I have two dataframes. The first one has Category2 and Category3, but Category1 is missing. I need to fill it in for each year, month, class and region based on df2. Note that the data covers several months and years; I have only pasted an excerpt here.
Here is df1

| Year | Month | Class | Region | Category2 | Category3 | Vol |
|------|-------|-------|--------|-----------|-----------|-----|
| 2022 | 1 | AA | R1 | S1 | F1 | 1 |
| 2022 | 1 | AA | R1 | S2 | F1 | 3 |
| 2022 | 1 | AA | R2 | S1 | F1 | 3 |
| 2022 | 1 | AA | R2 | S2 | F1 | 4 |
| 2022 | 1 | AA | R3 | S1 | F2 | 4 |
| 2022 | 1 | AA | R4 | S1 | F2 | 12 |
| 2022 | 1 | AA | R4 | S2 | F2 | 4 |
| 2022 | 1 | AA | R5 | S1 | F2 | 10 |
| 2022 | 1 | AA | R5 | S2 | F2 | 1 |
| 2022 | 1 | AA | R6 | S1 | F2 | 1 |
| 2022 | 1 | AA | R7 | S1 | F2 | 7 |
| 2022 | 1 | AA | R7 | S2 | F2 | 2 |
| 2022 | 1 | AA | R8 | S1 | F2 | 70 |
| 2022 | 1 | AA | R8 | S2 | F2 | 2 |
| 2022 | 1 | AA | R8 | S1 | F1 | 5 |
| 2022 | 1 | AA | R8 | S1 | F2 | 2 |
| 2022 | 1 | AA | R8 | S2 | F1 | 10 |
| 2022 | 1 | AA | R8 | S2 | F2 | 1 |
| 2022 | 1 | AA | R9 | S1 | F1 | 3 |
| 2022 | 1 | AA | R9 | S1 | F2 | 5 |
| 2022 | 1 | AA | R9 | S2 | F1 | 3 |

Here is df2
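
| Year | Month | Class | Region | Category1 | Category2 | Category3 | Vol |
|------|-------|-------|--------|-----------|-----------|-----------|-----|
| 2022 | 1 | AA | R1 | Shift1 | S1 | F1 | 1 |
| 2022 | 1 | AA | R1 | Shift1 | S2 | F1 | 3 |
| 2022 | 1 | AA | R2 | Shift1 | S1 | F2 | 1 |
| 2022 | 1 | AA | R2 | Shift1 | S2 | F1 | 5 |
| 2022 | 1 | AA | R3 | Shift2 | S2 | F1 | 4 |
| 2022 | 1 | AA | R3 | Shift1 | S1 | F1 | 48 |
| 2022 | 1 | AA | R3 | Shift1 | S1 | F2 | 37 |
| 2022 | 1 | AA | R3 | Shift1 | S1 | F3 | 5 |
| 2022 | 1 | AA | R3 | Shift1 | S2 | F1 | 248 |
| 2022 | 1 | AA | R3 | Shift1 | S2 | F2 | 3 |
| 2022 | 1 | AA | R3 | Shift1 | S2 | F3 | 2 |
| 2022 | 1 | AA | R4 | Shift2 | S1 | F2 | 7 |
| 2022 | 1 | AA | R4 | Shift1 | S1 | F2 | 100 |
| 2022 | 1 | AA | R4 | Shift1 | S1 | F3 | 6 |
| 2022 | 1 | AA | R4 | Shift1 | S2 | F1 | 154 |
| 2022 | 1 | AA | R4 | Shift1 | S2 | F2 | 45 |
| 2022 | 1 | AA | R4 | Shift1 | S2 | F3 | 35 |
| 2022 | 1 | AA | R5 | Shift2 | S1 | F1 | 2 |
| 2022 | 1 | AA | R5 | Shift2 | S1 | F2 | 8 |
| 2022 | 1 | AA | R5 | Shift2 | S2 | F1 | 3 |
| 2022 | 1 | AA | R5 | Shift1 | S1 | F1 | 30 |
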
So I need to take the Category1 distribution for the same year, month, class and region from df2 and split the volume in df1 according to it.
For example, row 6 of df1 shows a total volume of 12 for 2022, January, Class AA, Region R4, Category2 S1, Category3 F2:

| Year | Month | Class | Region | Category2 | Category3 | Vol |
|------|-------|-------|--------|-----------|-----------|-----|
| 2022 | 1 | AA | R4 | S1 | F2 | 12 |

From df2, we see that for the same year, month, class, region, Category2 and Category3, the split between the Category1 values is roughly 7% (7 of 107) and 93% (100 of 107):
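
| Year | Month | Class | Region | Category1 | Category2 | Category3 | Vol |
|------|-------|-------|--------|-----------|-----------|-----------|-----|
| 2022 | 1 | AA | R4 | Shift2 | S1 | F2 | 7 |
| 2022 | 1 | AA | R4 | Shift1 | S1 | F2 | 100 |
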
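The 7% / 93% figures can be reproduced with a grouped sum. This is only a sketch: the `keys` list and the `share` column are names introduced here for illustration, and it assumes the split should be computed within each combination of year, month, class, region, Category2 and Category3.

```python
import pandas as pd

# The two df2 rows for 2022 / month 1 / class AA / region R4 / S1 / F2.
df2 = pd.DataFrame({
    "Year": [2022, 2022],
    "Month": [1, 1],
    "Class": ["AA", "AA"],
    "Region": ["R4", "R4"],
    "Category1": ["Shift2", "Shift1"],
    "Category2": ["S1", "S1"],
    "Category3": ["F2", "F2"],
    "Vol": [7, 100],
})

# Assumed grouping: the columns that identify a single df1 row.
keys = ["Year", "Month", "Class", "Region", "Category2", "Category3"]

# Share of each Category1 value inside its group: Vol divided by the group total.
df2["share"] = df2["Vol"] / df2.groupby(keys)["Vol"].transform("sum")

print(df2[["Category1", "Vol", "share"]])
```

Here Shift2 gets 7/107 and Shift1 gets 100/107 of the group.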
I then need to regenerate the rows in table 1 so that the 12 is split in this manner, with a new Category1 column.
So this

| Year | Month | Class | Region | Category2 | Category3 | Vol |
|------|-------|-------|--------|-----------|-----------|-----|
| 2022 | 1 | AA | R4 | S1 | F2 | 12 |

would become this
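
| Year | Month | Class | Region | Category1 | Category2 | Category3 | Vol |
|------|-------|-------|--------|-----------|-----------|-----------|-----|
| 2022 | 1 | AA | R4 | Shift1 | S1 | F2 | 11 |
| 2022 | 1 | AA | R4 | Shift2 | S1 | F2 | 1 |

(Shift1 takes the larger part because it holds 100 of the 107 in df2.)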
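One possible way to expand a df1 row like this is a left merge against the per-group shares from df2. The sketch below assumes the join keys are year, month, class, region, Category2 and Category3; `keys`, `shares`, `expanded` and `Vol_exact` are illustrative names, and the volumes are left unrounded at this stage.

```python
import pandas as pd

keys = ["Year", "Month", "Class", "Region", "Category2", "Category3"]

# Row 6 of df1 and the matching df2 rows from the example.
df1 = pd.DataFrame([[2022, 1, "AA", "R4", "S1", "F2", 12]], columns=keys + ["Vol"])
df2 = pd.DataFrame(
    [[2022, 1, "AA", "R4", "Shift2", "S1", "F2", 7],
     [2022, 1, "AA", "R4", "Shift1", "S1", "F2", 100]],
    columns=["Year", "Month", "Class", "Region",
             "Category1", "Category2", "Category3", "Vol"],
)

# Per-group share of each Category1 value in df2.
shares = df2.copy()
shares["share"] = shares["Vol"] / shares.groupby(keys)["Vol"].transform("sum")
shares = shares[keys + ["Category1", "share"]]

# Left merge: each df1 row is repeated once per matching Category1 value.
expanded = df1.merge(shares, on=keys, how="left")

# Exact (fractional) split of the df1 volume.
expanded["Vol_exact"] = expanded["Vol"] * expanded["share"]

print(expanded[keys + ["Category1", "Vol_exact"]])
```

With `how="left"`, df1 rows that have no counterpart in df2 are kept with a missing Category1, so they can be inspected rather than silently dropped.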
I also need to round the results so that the individual values are whole numbers and together do not exceed the total of 12.
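One common way to get whole numbers that still add up to the original total is largest-remainder rounding: floor every part, then hand the leftover units to the parts with the biggest fractional remainders. The helper below is a hypothetical sketch (the name `split_whole` is made up here), not a pandas built-in.

```python
import numpy as np

def split_whole(total, weights):
    """Split integer `total` into whole parts proportional to `weights`."""
    weights = np.asarray(weights, dtype=float)
    exact = total * weights / weights.sum()   # fractional split
    parts = np.floor(exact).astype(int)       # round every part down
    leftover = int(total - parts.sum())       # units still to hand out
    # Give one extra unit to the parts with the largest fractional remainder.
    order = np.argsort(-(exact - parts), kind="stable")
    parts[order[:leftover]] += 1
    return parts

# Weights 100 (Shift1) and 7 (Shift2) from df2, total 12 from df1.
print(split_whole(12, [100, 7]).tolist())  # [11, 1]
```

Plain rounding of each part can overshoot or undershoot the total; this scheme always sums to exactly `total`.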
I have no idea how to do this in Python, though I suspect I would have to use groupby for part of it.
More details here: https://stackoverflow.com/questions/790 ... from-anoth