Koushik Chandra - 1 year ago
Bash Question

Dropping a Hive partition based on a condition at runtime

I have a table in Hive like this:

create table t1 (x int, y int, s string) partitioned by (wk int) stored as sequencefile;

The table contains the following data:

select * from t1;
| t1.x | t1.y | t1.s | t1.wk |
| 1 | 2 | abc | 10 |
| 4 | 5 | xyz | 11 |
| 7 | 8 | pqr | 12 |

Now the ask is to drop the oldest partition when partition count is

Can this be handled in HQL or through a shell script, and if so, how?

Answer Source

If your partitions are ordered by date, you can write a shell script that uses hive -e 'SHOW PARTITIONS t1' to list all partitions. For your example, it will return:

wk=10
wk=11
wk=12

Then you can issue hive -e 'ALTER TABLE t1 DROP PARTITION (wk=10)' to remove the first partition.

So something like:

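# Adjust the comparison to the partition-count condition you need.
if (( $(hive -e 'SHOW PARTITIONS t1' | grep -c wk) < 2 )); then
    # SHOW PARTITIONS prints lines such as wk=10; take the first one.
    partition=$(hive -e 'SHOW PARTITIONS t1' | grep wk | head -1)
    # Double quotes are required so that $partition is expanded by the shell.
    hive -e "ALTER TABLE t1 DROP PARTITION ($partition)"
fi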
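
The snippet above calls hive twice and relies on the order in which SHOW PARTITIONS prints its rows. As far as I know that order is alphabetical, so wk=9 would be listed after wk=10. Below is a rough sketch of one way to pick the oldest partition with a numeric sort instead. The function name oldest_partition_to_drop and the max_partitions argument are made up for illustration; the threshold is an assumption, since the question does not state the exact count.

```shell
#!/usr/bin/env bash
# Sketch only: reads SHOW PARTITIONS output (lines such as wk=10) on stdin
# and prints the oldest partition spec when the partition count exceeds
# the limit. Prints nothing otherwise.
# max_partitions is a hypothetical threshold, not taken from the question.
oldest_partition_to_drop() {
    local max_partitions=$1
    local partitions count=0

    # Keep only wk=<number> lines and sort numerically on the value after '='.
    partitions=$(grep -E '^wk=[0-9]+$' | sort -t= -k2,2n || true)

    if [ -n "$partitions" ]; then
        count=$(printf '%s\n' "$partitions" | wc -l)
    fi

    if (( count > max_partitions )); then
        # After the numeric sort, the first line is the smallest week number.
        printf '%s\n' "$partitions" | head -n 1
    fi
}

# Possible usage (assumes the hive CLI is on PATH; -S runs it in silent mode):
#   target=$(hive -S -e 'SHOW PARTITIONS t1' | oldest_partition_to_drop 2)
#   if [ -n "$target" ]; then
#       hive -e "ALTER TABLE t1 DROP PARTITION ($target)"
#   fi
```

Keeping the selection logic in a function that only reads stdin means it can be checked with sample input, without touching the real table, before wiring it to the ALTER TABLE call.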