
Commit 789969a

usernamedt authored and Leonid Borchuk committed
Movable DataBase Locales for Cloudberry
We inherited this issue from PostgreSQL. PostgreSQL uses glibc to sort strings. In glibc 2.28, collations changed badly (in general, there are no guarantees about collation order when glibc is updated). Changing collations breaks indexes. Similarly, a cluster with differing collations also behaves unpredictably.

What changed in glibc, and when, is documented at https://github.com/ardentperf/glibc-unicode-sorting. There is also a dedicated PostgreSQL wiki page, https://wiki.postgresql.org/wiki/Locale_data_changes, and a YouTube video, https://www.youtube.com/watch?v=0E6O-V8Jato.

In short, the issue can be seen from bash:

    ( echo "1-1"; echo "11" ) | LC_COLLATE=en_US.UTF-8 sort

gives different results on Ubuntu 18.04 and 22.04.

There is no way to solve the problem other than by not changing the symbol order. We freeze the symbol order and use it instead of glibc's. The solution is here: https://github.com/postgredients/mdb-locales.

This PR adds a PostgreSQL patch that replaces all glibc locale-related calls with calls to an external library. It is activated by a new configure parameter, --with-mdblocales, which is off by default. Using custom locales requires the libmdblocales1 package and the mdb-locales package with the symbol table. Building requires the libmdblocales-dev package with headers.
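The reproduction from the commit message can be wrapped in a small script for comparing hosts. This is only a sketch: the helper name `sort_pair` is hypothetical, and it sets `LC_ALL` rather than `LC_COLLATE` on the assumption that an `LC_ALL` already present in the environment would otherwise take precedence. No expected output is given for `en_US.UTF-8`, because that order depends on the installed glibc locale data.

```shell
#!/bin/sh
# Sort the two sample strings from the commit message under a given locale.
# sort_pair is a hypothetical helper name, not part of the patch.
sort_pair() {
    printf '%s\n' "1-1" "11" | LC_ALL="$1" sort
}

# Byte-order (C) collation does not depend on glibc locale data.
echo "C:"
sort_pair C

# The en_US.UTF-8 order depends on the glibc version, so compare this part
# between hosts. Skipped when the locale is not installed.
if locale -a 2>/dev/null | grep -qiE '^en_US\.utf-?8$'; then
    echo "en_US.UTF-8:"
    sort_pair en_US.UTF-8
fi
```

Running the script on two hosts and diffing the output shows whether their collation order for this pair differs.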
1 parent f42e3e9 commit 789969a

3 files changed, +284 -1 lines changed


devops/build/automation/cloudberry/scripts/configure-cloudberry.sh

Lines changed: 12 additions & 0 deletions
@@ -62,6 +62,12 @@
 # --enable-cassert
 # --enable-debug-extensions
 #
+# ENABLE_MDBLOCALES - Enable custom locales (true/false, defaults to
+# false)
+#
+# When true, add option:
+# --with-mdblocales
+#
 # Prerequisites:
 # - System dependencies must be installed:
 # * xerces-c development files
@@ -132,6 +138,11 @@ if [ "${ENABLE_DEBUG:-false}" = "true" ]; then
     --enable-debug-extensions"
 fi
 
+CONFIGURE_MDBLOCALES_OPTS="--without-mdblocales"
+if [ "${ENABLE_MDBLOCALES:-false}" = "true" ]; then
+    CONFIGURE_MDBLOCALES_OPTS="--with-mdblocales"
+fi
+
 # Configure build
 log_section "Configure"
 execute_cmd ./configure --prefix=${BUILD_DESTINATION} \
@@ -158,6 +169,7 @@ execute_cmd ./configure --prefix=${BUILD_DESTINATION} \
     --with-ssl=openssl \
     --with-openssl \
     --with-uuid=e2fs \
+    ${CONFIGURE_MDBLOCALES_OPTS} \
     --with-includes=/usr/local/xerces-c/include \
     --with-libraries=${BUILD_DESTINATION}/lib || exit 4
 log_section_end "Configure"
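The toggle added in this hunk can be tried in isolation. The sketch below mirrors its logic in a standalone function; the name `mdblocales_opt` is hypothetical and does not appear in the script. As in the diff, only the exact string `true` enables the option, and any other value falls back to `--without-mdblocales`.

```shell
#!/bin/sh
# Map ENABLE_MDBLOCALES to the configure flag, as the hunk above does.
# mdblocales_opt is a hypothetical name used only for this illustration.
mdblocales_opt() {
    if [ "${ENABLE_MDBLOCALES:-false}" = "true" ]; then
        echo "--with-mdblocales"
    else
        echo "--without-mdblocales"
    fi
}

# Each call runs in a subshell so the variable does not leak between cases.
echo "unset: $(unset ENABLE_MDBLOCALES; mdblocales_opt)"
echo "true:  $(ENABLE_MDBLOCALES=true; mdblocales_opt)"
echo "yes:   $(ENABLE_MDBLOCALES=yes; mdblocales_opt)"
```

With the real script, the equivalent would presumably be to export `ENABLE_MDBLOCALES=true` before running `configure-cloudberry.sh`, alongside whatever other variables that script expects.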

src/test/regress/output/misc.source

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -613,6 +613,6 @@ CONTEXT: SQL function "equipment" during startup
613613
SELECT mdb_locale_enabled();
614614
mdb_locale_enabled
615615
--------------------
616-
t
616+
f
617617
(1 row)
618618

src/test/regress/sql/misc.sql

Lines changed: 271 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,271 @@
1+
--
2+
-- MISC
3+
--
4+
5+
--
6+
-- BTREE
7+
--
8+
--UPDATE onek
9+
-- SET unique1 = onek.unique1 + 1;
10+
11+
--UPDATE onek
12+
-- SET unique1 = onek.unique1 - 1;
13+
14+
--
15+
-- BTREE partial
16+
--
17+
-- UPDATE onek2
18+
-- SET unique1 = onek2.unique1 + 1;
19+
20+
--UPDATE onek2
21+
-- SET unique1 = onek2.unique1 - 1;
22+
23+
--
24+
-- BTREE shutting out non-functional updates
25+
--
26+
-- the following two tests seem to take a long time on some
27+
-- systems. This non-func update stuff needs to be examined
28+
-- more closely. - jolly (2/22/96)
29+
--
30+
/* GPDB TODO: This test is disabled for now, because when running with ORCA,
31+
you get an error:
32+
ERROR: multiple updates to a row by the same query is not allowed
33+
UPDATE tmp
34+
SET stringu1 = reverse_name(onek.stringu1)
35+
FROM onek
36+
WHERE onek.stringu1 = 'JBAAAA' and
37+
onek.stringu1 = tmp.stringu1;
38+
39+
UPDATE tmp
40+
SET stringu1 = reverse_name(onek2.stringu1)
41+
FROM onek2
42+
WHERE onek2.stringu1 = 'JCAAAA' and
43+
onek2.stringu1 = tmp.stringu1;
44+
*/
45+
46+
DROP TABLE tmp;
47+
48+
--UPDATE person*
49+
-- SET age = age + 1;
50+
51+
--UPDATE person*
52+
-- SET age = age + 3
53+
-- WHERE name = 'linda';
54+
55+
--
56+
-- copy
57+
--
58+
COPY onek TO '@abs_builddir@/results/onek.data';
59+
60+
DELETE FROM onek;
61+
62+
COPY onek FROM '@abs_builddir@/results/onek.data';
63+
64+
SELECT unique1 FROM onek WHERE unique1 < 2 ORDER BY unique1;
65+
66+
DELETE FROM onek2;
67+
68+
COPY onek2 FROM '@abs_builddir@/results/onek.data';
69+
70+
SELECT unique1 FROM onek2 WHERE unique1 < 2 ORDER BY unique1;
71+
72+
COPY BINARY stud_emp TO '@abs_builddir@/results/stud_emp.data';
73+
74+
DELETE FROM stud_emp;
75+
76+
COPY BINARY stud_emp FROM '@abs_builddir@/results/stud_emp.data';
77+
78+
SELECT * FROM stud_emp;
79+
80+
-- COPY aggtest FROM stdin;
81+
-- 56 7.8
82+
-- 100 99.097
83+
-- 0 0.09561
84+
-- 42 324.78
85+
-- .
86+
-- COPY aggtest TO stdout;
87+
88+
89+
--
90+
-- inheritance stress test
91+
--
92+
SELECT * FROM a_star*;
93+
94+
SELECT *
95+
FROM b_star* x
96+
WHERE x.b = text 'bumble' or x.a < 3;
97+
98+
SELECT class, a
99+
FROM c_star* x
100+
WHERE x.c ~ text 'hi';
101+
102+
SELECT class, b, c
103+
FROM d_star* x
104+
WHERE x.a < 100;
105+
106+
SELECT class, c FROM e_star* x WHERE x.c NOTNULL;
107+
108+
SELECT * FROM f_star* x WHERE x.c ISNULL;
109+
110+
-- grouping and aggregation on inherited sets have been busted in the past...
111+
112+
SELECT sum(a) FROM a_star*;
113+
114+
SELECT class, sum(a) FROM a_star* GROUP BY class ORDER BY class;
115+
116+
117+
ALTER TABLE f_star RENAME COLUMN f TO ff;
118+
119+
ALTER TABLE e_star* RENAME COLUMN e TO ee;
120+
121+
ALTER TABLE d_star* RENAME COLUMN d TO dd;
122+
123+
ALTER TABLE c_star* RENAME COLUMN c TO cc;
124+
125+
ALTER TABLE b_star* RENAME COLUMN b TO bb;
126+
127+
ALTER TABLE a_star* RENAME COLUMN a TO aa;
128+
129+
SELECT class, aa
130+
FROM a_star* x
131+
WHERE aa ISNULL;
132+
133+
-- As of Postgres 7.1, ALTER implicitly recurses,
134+
-- so this should be same as ALTER a_star*
135+
136+
ALTER TABLE a_star RENAME COLUMN aa TO foo;
137+
138+
SELECT class, foo
139+
FROM a_star* x
140+
WHERE x.foo >= 2;
141+
142+
ALTER TABLE a_star RENAME COLUMN foo TO aa;
143+
144+
SELECT *
145+
from a_star*
146+
WHERE aa < 1000;
147+
148+
ALTER TABLE f_star ADD COLUMN f int4;
149+
150+
UPDATE f_star SET f = 10;
151+
152+
ALTER TABLE e_star* ADD COLUMN e int4;
153+
154+
--UPDATE e_star* SET e = 42;
155+
156+
SELECT * FROM e_star*;
157+
158+
ALTER TABLE a_star* ADD COLUMN a text;
159+
160+
-- That ALTER TABLE should have added TOAST tables.
161+
SELECT relname, reltoastrelid <> 0 AS has_toast_table
162+
FROM pg_class
163+
WHERE oid::regclass IN ('a_star', 'c_star')
164+
ORDER BY 1;
165+
166+
--UPDATE b_star*
167+
-- SET a = text 'gazpacho'
168+
-- WHERE aa > 4;
169+
170+
SELECT class, aa, a FROM a_star*;
171+
172+
173+
--
174+
-- versions
175+
--
176+
177+
--
178+
-- postquel functions
179+
--
180+
--
181+
-- mike does post_hacking,
182+
-- joe and sally play basketball, and
183+
-- everyone else does nothing.
184+
--
185+
SELECT p.name, name(p.hobbies) FROM ONLY person p;
186+
187+
--
188+
-- as above, but jeff also does post_hacking.
189+
--
190+
SELECT p.name, name(p.hobbies) FROM person* p;
191+
192+
--
193+
-- the next two queries demonstrate how functions generate bogus duplicates.
194+
-- this is a "feature" ..
195+
--
196+
SELECT DISTINCT hobbies_r.name, name(hobbies_r.equipment) FROM hobbies_r
197+
ORDER BY 1,2;
198+
199+
SELECT hobbies_r.name, (hobbies_r.equipment).name FROM hobbies_r;
200+
201+
--
202+
-- mike needs advil and peet's coffee,
203+
-- joe and sally need hightops, and
204+
-- everyone else is fine.
205+
--
206+
SELECT p.name, name(p.hobbies), name(equipment(p.hobbies)) FROM ONLY person p;
207+
208+
--
209+
-- as above, but jeff needs advil and peet's coffee as well.
210+
--
211+
SELECT p.name, name(p.hobbies), name(equipment(p.hobbies)) FROM person* p;
212+
213+
--
214+
-- just like the last two, but make sure that the target list fixup and
215+
-- unflattening is being done correctly.
216+
--
217+
SELECT name(equipment(p.hobbies)), p.name, name(p.hobbies) FROM ONLY person p;
218+
219+
SELECT (p.hobbies).equipment.name, p.name, name(p.hobbies) FROM person* p;
220+
221+
SELECT (p.hobbies).equipment.name, name(p.hobbies), p.name FROM ONLY person p;
222+
223+
SELECT name(equipment(p.hobbies)), name(p.hobbies), p.name FROM person* p;
224+
225+
SELECT name(equipment(hobby_construct(text 'skywalking', text 'mer')));
226+
227+
SELECT name(equipment(hobby_construct_named(text 'skywalking', text 'mer')));
228+
229+
SELECT name(equipment_named(hobby_construct_named(text 'skywalking', text 'mer')));
230+
231+
SELECT name(equipment_named_ambiguous_1a(hobby_construct_named(text 'skywalking', text 'mer')));
232+
233+
SELECT name(equipment_named_ambiguous_1b(hobby_construct_named(text 'skywalking', text 'mer')));
234+
235+
SELECT name(equipment_named_ambiguous_1c(hobby_construct_named(text 'skywalking', text 'mer')));
236+
237+
SELECT name(equipment_named_ambiguous_2a(text 'skywalking'));
238+
239+
SELECT name(equipment_named_ambiguous_2b(text 'skywalking'));
240+
241+
SELECT hobbies_by_name('basketball');
242+
243+
SELECT name, overpaid(emp.*) FROM emp;
244+
245+
--
246+
-- Try a few cases with SQL-spec row constructor expressions
247+
--
248+
SELECT * FROM equipment(ROW('skywalking', 'mer'));
249+
250+
SELECT name(equipment(ROW('skywalking', 'mer')));
251+
252+
SELECT *, name(equipment(h.*)) FROM hobbies_r h;
253+
254+
SELECT *, (equipment(CAST((h.*) AS hobbies_r))).name FROM hobbies_r h;
255+
256+
--
257+
-- functional joins
258+
--
259+
260+
--
261+
-- instance rules
262+
--
263+
264+
--
265+
-- rewrite rules
266+
--
267+
268+
269+
--- mdb-related
270+
271+
SELECT mdb_locale_enabled();
