I’ve been beating my head against this absolutely infuriating bug for the last 48 hours, so I thought I’d finally throw in the towel and try asking here before I throw my laptop out the window.
I’m trying to parse the response XML from a call I made to AWS SimpleDB. The response is coming back on the wire just fine; for example, it may look like:
<?xml version="1.0" encoding="utf-8"?>
<ListDomainsResponse xmlns="http://sdb.amazonaws.com/doc/2009-04-15/">
<ListDomainsResult>
<DomainName>Audio</DomainName>
<DomainName>Course</DomainName>
<DomainName>DocumentContents</DomainName>
<DomainName>LectureSet</DomainName>
<DomainName>MetaData</DomainName>
<DomainName>Professors</DomainName>
<DomainName>Tag</DomainName>
</ListDomainsResult>
<ResponseMetadata>
<RequestId>42330b4a-e134-6aec-e62a-5869ac2b4575</RequestId>
<BoxUsage>0.0000071759</BoxUsage>
</ResponseMetadata>
</ListDomainsResponse>
I pass in this XML to a parser with
XMLEventReader eventReader = xmlInputFactory.createXMLEventReader(response.getContent());
and call eventReader.nextEvent();
a bunch of times to get the data I want.
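A loop of this kind might look roughly like the sketch below. The class and method names (DomainNameReader, readDomainNames) are made up for illustration and are not taken from the AWS SDK, whose own unmarshalling code appears in the stack trace further down; only the standard javax.xml.stream API is assumed.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical helper, for illustration only.
class DomainNameReader {
    /** Collects the text of every DomainName element found in the stream. */
    static List<String> readDomainNames(InputStream in) throws XMLStreamException {
        XMLEventReader reader = XMLInputFactory.newInstance().createXMLEventReader(in);
        List<String> names = new ArrayList<>();
        try {
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (event.isStartElement()
                        && "DomainName".equals(event.asStartElement().getName().getLocalPart())) {
                    // getElementText() reads up to the matching end tag
                    names.add(reader.getElementText());
                }
            }
        } finally {
            reader.close();
        }
        return names;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<?xml version=\"1.0\" encoding=\"utf-8\"?>"
                + "<ListDomainsResponse xmlns=\"http://sdb.amazonaws.com/doc/2009-04-15/\">"
                + "<ListDomainsResult><DomainName>Audio</DomainName>"
                + "<DomainName>Course</DomainName></ListDomainsResult>"
                + "</ListDomainsResponse>";
        System.out.println(readDomainNames(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8))));
    }
}
```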
Here’s the bizarre part — it works great inside the local server. The response comes in, I parse it, everyone’s happy. The problem is that when I deploy the code to Google App Engine, the outgoing request still works, and the response XML seems 100% identical and correct to me, but the response fails to parse with the following exception:
com.amazonaws.http.HttpClient handleResponse: Unable to unmarshall response (ParseError at [row,col]:[1,1]
Message: Content is not allowed in prolog.): <?xml version="1.0" encoding="utf-8"?>
<ListDomainsResponse xmlns="http://sdb.amazonaws.com/doc/2009-04-15/"><ListDomainsResult><DomainName>Audio</DomainName><DomainName>Course</DomainName><DomainName>DocumentContents</DomainName><DomainName>LectureSet</DomainName><DomainName>MetaData</DomainName><DomainName>Professors</DomainName><DomainName>Tag</DomainName></ListDomainsResult><ResponseMetadata><RequestId>42330b4a-e134-6aec-e62a-5869ac2b4575</RequestId><BoxUsage>0.0000071759</BoxUsage></ResponseMetadata></ListDomainsResponse>
javax.xml.stream.XMLStreamException: ParseError at [row,col]:[1,1]
Message: Content is not allowed in prolog.
at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.next(Unknown Source)
at com.sun.xml.internal.stream.XMLEventReaderImpl.nextEvent(Unknown Source)
at com.amazonaws.transform.StaxUnmarshallerContext.nextEvent(StaxUnmarshallerContext.java:153)
... (rest of lines omitted)
I have double, triple, quadruple checked this XML for ‘invisible characters’ or non-UTF8 encoded characters, etc. I looked at it byte-by-byte in an array for byte-order-marks or something of that nature. Nothing; it passes every validation test I could throw at it. Even stranger, it happens if I use a Saxon-based parser as well — but ONLY on GAE, it always works fine in my local environment.
It makes it very hard to trace the code for problems when I can only run the debugger on an environment that works perfectly (I haven’t found any good way to remotely debug on GAE). Nevertheless, using the primitive means I have, I’ve tried a million approaches including:
- XML with and without the prolog
- With and without newlines
- With and without the «encoding=» attribute in the prolog
- Both newline styles
- With and without the chunking information present in the HTTP stream
And I’ve tried most of these in multiple combinations where it made sense they would interact — nothing! I’m at my wit’s end. Has anyone seen an issue like this before that can hopefully shed some light on it?
Thanks!
If you are working with XML files, you may have encountered an error message that says «Content is not allowed in prolog». This error can be frustrating, but it is usually easy to fix. In this guide, we will walk you through the steps to resolve this error.
What causes the ‘Content Is Not Allowed in Prolog’ Error?
The ‘Content is not allowed in prolog’ error message appears when there is invalid content before the XML declaration in the file. The XML declaration is the first line of an XML file and typically contains the version and encoding information.
There are several reasons why invalid content may appear before the XML declaration, such as:
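- A hidden character (such as a byte order mark) or whitespace before the XML declaration
- Malformed tags or comments placed ahead of the declaration
- A mismatch between the file's actual encoding and the encoding the parser uses
- A file that is not XML at all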
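To see the first cause in isolation, the following minimal sketch parses the same document twice with the JDK's built-in DOM parser, once as is and once with a stray dot in front of the declaration. PrologErrorDemo and parseError are illustrative names, and the exact message text depends on the parser implementation and locale.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical demo class, for illustration only.
class PrologErrorDemo {
    /** Parses the text and returns the parse error, or null if the text is well-formed. */
    static SAXParseException parseError(String xml) throws Exception {
        try {
            DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return null;
        } catch (SAXParseException e) {
            return e;
        }
    }

    public static void main(String[] args) throws Exception {
        String good = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><root/>";
        String bad = "." + good; // a stray character before the declaration

        System.out.println("clean document error: " + parseError(good));
        SAXParseException e = parseError(bad);
        System.out.println("broken document error at "
                + e.getLineNumber() + ":" + e.getColumnNumber() + " " + e.getMessage());
    }
}
```

The error position reported for the broken document points at the very start of the input, which matches the [row,col]:[1,1] seen in typical stack traces for this error.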
How to Fix the ‘Content Is Not Allowed in Prolog’ Error
Follow the steps below to fix the ‘Content is not allowed in prolog’ error for XML files:
- Open the XML file in a text editor, such as Notepad or Sublime Text.
- Look for any invalid content before the XML declaration, such as hidden characters or whitespace. Delete any invalid content and save the file.
- Check for malformed HTML tags or comments. Fix any errors and save the file.
- Check the encoding of the file. The encoding should match the encoding declared in the XML declaration. If the encoding is incorrect, change it and save the file.
- Ensure that the file actually contains XML. The extension does not affect parsing, but if the file is named .txt or similar, renaming it to .xml makes it easier for editors and tools to recognize.
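Hidden characters are easier to spot as raw bytes than in an editor window. The sketch below prints the first bytes of a file in hexadecimal; LeadingBytes and hex are illustrative names, and the file path is taken from the first command-line argument (with a built-in sample used when none is given).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, for illustration only.
class LeadingBytes {
    /** Formats at most max leading bytes as space-separated hex pairs. */
    static String hex(byte[] data, int max) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < Math.min(max, data.length); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", data[i] & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Use the file named on the command line, or a sample that starts with a BOM.
        byte[] data = args.length > 0
                ? Files.readAllBytes(Paths.get(args[0]))
                : "\uFEFF<?xml version=\"1.0\"?><root/>".getBytes(StandardCharsets.UTF_8);
        System.out.println(hex(data, 16));
    }
}
```

A clean UTF-8 file with a declaration starts with 3C 3F 78 6D 6C (the characters <?xml). A leading EF BB BF is a UTF-8 byte order mark, and FE FF or FF FE at the start indicates UTF-16.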
FAQ
1. How do I open an XML file in a text editor?
To open an XML file in a text editor, right-click on the file and select ‘Open with’ and choose a text editor program, such as Notepad or Sublime Text.
2. What are hidden characters?
Hidden characters are characters that are part of the file but are not visible in the text editor. Examples include spaces, tabs, line breaks, the byte order mark (U+FEFF), and zero-width spaces.
3. How do I check the encoding of an XML file?
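The XML declaration at the beginning of the file states the intended encoding in its ‘encoding’ attribute. This is only what the file claims: the bytes may have been saved in a different encoding, so also check the encoding that your text editor reports for the file.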
4. Can I use an XML file with a .txt extension?
Technically, you can use an XML file with any extension. However, it is best practice to use the .xml extension to avoid confusion and ensure that the file is recognized as an XML file.
5. What should I do if the error message persists after trying the steps above?
If the error message persists after trying the steps above, it may be caused by a more complex issue. In this case, it may be helpful to seek assistance from a developer or technical support team.
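For question 3 above, the declared encoding can also be read programmatically. This is a minimal sketch using the standard StAX API; DeclaredEncoding and declared are illustrative names. It reports only what the declaration says, not how the bytes were actually saved.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, for illustration only.
class DeclaredEncoding {
    /** Returns the encoding named in the XML declaration, or null if none is declared. */
    static String declared(InputStream in) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(in);
        try {
            return reader.getCharacterEncodingScheme();
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        byte[] xml = "<?xml version=\"1.0\" encoding=\"utf-8\"?><root/>"
                .getBytes(StandardCharsets.UTF_8);
        System.out.println(declared(new ByteArrayInputStream(xml)));
    }
}
```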
Conclusion
The ‘Content is not allowed in prolog’ error message can be frustrating, but it is usually easy to fix by following the steps above. By checking for hidden characters, malformed HTML tags, and correct encoding, you can ensure that your XML files are error-free and ready for use.
If you have any additional questions or concerns, feel free to consult the FAQ section or seek assistance from a technical expert.
We use a SAX parser to parse an XML file and get the following error message:
org.xml.sax.SAXParseException; systemId: ../src/main/resources/staff.xml; lineNumber: 1; columnNumber: 1; Content is not allowed in prolog.
In short, invalid text or a BOM before the XML declaration, or a different encoding, will cause the SAX error "Content is not allowed in prolog".
- 1. Invalid text before the XML declaration.
- 2. BOM at the beginning of the XML file.
- 3. Different encoding format
- 4. Download Source Code
- 5. References
1. Invalid text before the XML declaration.
Any text before the XML declaration will cause the "Content is not allowed in prolog" error.
For example, the XML file below contains an extra small dot "." before the XML declaration.
. yong mook kim mkyong 100000
To fix it, delete any text before the XML declaration.
yong mook kim mkyong 100000
2. BOM at the beginning of the XML file.
Many text editors automatically add a BOM to a UTF-8 file.
Note: read the following articles:
- Java adds and removes BOM from a UTF-8 file
- Wikipedia: Byte order mark (BOM)
Tested with Java 11 and Java 8, the built-in SAX parser can parse a UTF-8 file with a BOM correctly; however, some developers have reported that a BOM caused an error while parsing XML.
To fix it, remove the BOM from the UTF-8 file.
- Remove the BOM with code
- In Notepad++, set the encoding to "UTF-8 without BOM".
- In IntelliJ IDEA, directly on the file, select "Remove BOM".
P.S. Many text or code editors have features to add or remove the byte order mark (BOM) for a file; try looking for the feature in the menus.
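When the XML arrives as a stream rather than a file on disk, the BOM can be skipped before the parser sees it. The following is a minimal sketch that handles only the UTF-8 BOM (bytes EF BB BF); BomSkipper and skipUtf8Bom are illustrative names, not part of any library.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;

// Hypothetical helper, for illustration only.
class BomSkipper {
    /** Returns a stream positioned after a leading UTF-8 BOM, if one is present. */
    static InputStream skipUtf8Bom(InputStream in) throws IOException {
        PushbackInputStream pushback = new PushbackInputStream(in, 3);
        byte[] head = new byte[3];
        int n = 0;
        // read() may return fewer bytes than requested, so loop until 3 bytes or end of stream
        while (n < 3) {
            int r = pushback.read(head, n, 3 - n);
            if (r < 0) {
                break;
            }
            n += r;
        }
        boolean bom = n == 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF;
        if (!bom && n > 0) {
            // Not a BOM: put the bytes back so the caller sees the stream unchanged
            pushback.unread(head, 0, n);
        }
        return pushback;
    }

    public static void main(String[] args) throws IOException {
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, '<', 'a', '/', '>'};
        InputStream in = skipUtf8Bom(new ByteArrayInputStream(withBom));
        System.out.println((char) in.read());
    }
}
```

The returned stream can be passed to the parser in place of the original one.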
3. Different encoding format
A different encoding is another common cause of the XML "Content is not allowed in prolog" error.
For example, a UTF-8 XML file.
mkyong support 5000 yflow admin 8000
And we use the UTF-16 encoding to parse the above UTF-8 encoded XML file.
SAXParserFactory factory = SAXParserFactory.newInstance();
try (InputStream is = getXMLFileAsStream()) {
    SAXParser saxParser = factory.newSAXParser();
    // parse XML and map to object; it works, but is not recommended, try JAXB
    MapStaffObjectHandlerSax handler = new MapStaffObjectHandlerSax();
    // more options for configuration
    XMLReader xmlReader = saxParser.getXMLReader();
    xmlReader.setContentHandler(handler);
    InputSource source = new InputSource(is);
    // UTF-16 to parse a UTF-8 XML file
    source.setEncoding(StandardCharsets.UTF_16.toString());
    xmlReader.parse(source);
    // print all
    List result = handler.getResult();
    result.forEach(System.out::println);
} catch (ParserConfigurationException | SAXException | IOException e) {
    e.printStackTrace();
}
Output
[Fatal Error] :1:1: Content is not allowed in prolog.
org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; Content is not allowed in prolog.
at java.xml/com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1243)
at java.xml/com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:635)
at com.mkyong.xml.sax.ReadXmlSaxParser2.main(ReadXmlSaxParser2.java:45)
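The mismatch can be reproduced without any files. The sketch below is a self-contained variant of the example above, not the article's own code: it parses the same UTF-8 bytes twice, once letting the parser detect the encoding and once forcing UTF-16 through InputSource.setEncoding. EncodingMismatchDemo and parses are illustrative names.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, for illustration only.
class EncodingMismatchDemo {
    /** Returns true if the bytes parse; forcedEncoding may be null to let the parser detect it. */
    static boolean parses(byte[] xml, String forcedEncoding) throws Exception {
        InputSource source = new InputSource(new ByteArrayInputStream(xml));
        if (forcedEncoding != null) {
            source.setEncoding(forcedEncoding);
        }
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(source, new DefaultHandler());
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] utf8 = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><staff>mkyong</staff>"
                .getBytes(StandardCharsets.UTF_8);
        System.out.println("auto-detected: " + parses(utf8, null));
        System.out.println("forced UTF-16: " + parses(utf8, "UTF-16"));
    }
}
```

Decoded as UTF-16, the first two bytes of the declaration become a single unrelated character, so the parser sees content before any markup and rejects the document at position 1:1.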
4. Download Source Code
$ git clone
$ cd java-xml
$ cd src/main/java/com/mkyong/xml/sax/
5. References
- Java SAX parser
- Java adds and removes BOM from a UTF-8 file
- Wikipedia: Byte order mark (BOM)
- What is the difference between UTF-8 and UTF-8 without BOM?
Original: “https://mkyong.com/java/sax-error-content-is-not-allowed-in-prolog/”
For example, in Problems I have
Content is not allowed in prolog.
How do I find out why it is there (which Eclipse plugin has put it there), and how do I turn it off?
asked Mar 2, 2011 at 11:00
I had this problem too, and it was because I had changed and saved the file in UltraEdit. After the save, the file encoding changed and the file included characters that Eclipse was not able to read.
You can open the file with the Windows "Editor" tool and delete the characters that Eclipse cannot read. You will spot them right away.
answered Mar 2, 2011 at 11:25 by Markus Lausberg
This sounds like an error in an XML file. Most of the time, "Content is not allowed in prolog" means that your XML file is not in the right format or does not even start the right way.
answered Mar 2, 2011 at 11:04 by Chris
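"Content is not allowed in prolog" is the error thrown by Xerces when there’s something in an XML file or stream that precedes the <?xml?> declaration. There must be nothing before that, not even whitespace, and no byte order mark that reaches the parser as an ordinary character (which happens when the input is handed over as a Reader or a String).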
answered Mar 2, 2011 at 11:10 by skaffman
Double-click the message and it should take you to the file (and ideally location within the file) that is the source of the problem.
This specific error sounds like you’ve got a malformed XML file.
answered Mar 2, 2011 at 11:02 by Joachim Sauer
The «Fatal Error :1:1: Content is not allowed in prolog» is a common error encountered while working with Java and XML. This error message indicates that there is an issue with the XML file and the XML parser is unable to parse the document correctly. The problem could be due to a variety of reasons, including invalid characters or syntax issues within the XML file. If you’re encountering this error, don’t worry, there are several methods you can try to resolve it.
Method 1: Remove Extra White Spaces
To fix the "Fatal Error :1:1: Content is not allowed in prolog" error in Java by removing extra white spaces, you can follow these steps:
- Open the XML file that is causing the error in a text editor.
- Look for any extra white spaces before the XML declaration (<?xml version="1.0" encoding="UTF-8"?>). These extra white spaces can cause the "Content is not allowed in prolog" error.
- Remove any extra white spaces before the XML declaration.
- Save the file and try to run your Java program again.
Here’s an example of what the XML file should look like after removing extra white spaces:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<element>data</element>
</root>
And here’s an example of what the XML file should NOT look like (note the blank spaces before the XML declaration):
   <?xml version="1.0" encoding="UTF-8"?>
<root>
<element>data</element>
</root>
Note that removing extra white spaces before the XML declaration is just one possible solution to the «Content is not allowed in prolog» error. There may be other causes for this error, and other solutions may be necessary.
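When the XML is held in a String rather than a file, the same cleanup can be done in code before parsing. This is a minimal sketch; PrologTrim and trimLeading are illustrative names, and the method only touches characters at the very start of the text.

```java
// Hypothetical helper, for illustration only.
class PrologTrim {
    /** Drops leading whitespace and U+FEFF so the text starts at its first real character. */
    static String trimLeading(String xml) {
        int i = 0;
        while (i < xml.length()
                && (Character.isWhitespace(xml.charAt(i)) || xml.charAt(i) == '\uFEFF')) {
            i++;
        }
        return xml.substring(i);
    }

    public static void main(String[] args) {
        String raw = "  \n<?xml version=\"1.0\" encoding=\"UTF-8\"?><root><element>data</element></root>";
        System.out.println(trimLeading(raw));
    }
}
```

Whitespace inside the document is left alone, so the result can be handed to the parser through a StringReader.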
Method 2: Remove BOM Character
To fix the «Fatal Error :1:1: Content is not allowed in prolog» issue in Java, we can remove the BOM (Byte Order Mark) character from the XML file. Here is how to do it:
- Read the XML file and store it in a string variable.
String xml = new String(Files.readAllBytes(Paths.get("file.xml")), StandardCharsets.UTF_8);
- Check if the XML file contains the BOM character at the beginning.
if (xml.startsWith("\uFEFF")) {
xml = xml.substring(1);
}
- Write the modified XML string back to the file.
Files.write(Paths.get("file.xml"), xml.getBytes(StandardCharsets.UTF_8));
Here is the complete code:
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class RemoveBOM {
    public static void main(String[] args) throws Exception {
        // Decode the whole file as UTF-8
        String xml = new String(Files.readAllBytes(Paths.get("file.xml")), StandardCharsets.UTF_8);
        // A UTF-8 BOM decodes to the single character U+FEFF
        if (xml.startsWith("\uFEFF")) {
            xml = xml.substring(1);
        }
        // Overwrite the file without the BOM
        Files.write(Paths.get("file.xml"), xml.getBytes(StandardCharsets.UTF_8));
    }
}
This code reads the XML file using the Files.readAllBytes method, which returns a byte array. We convert the byte array to a string using the UTF-8 charset.
Next, we check if the XML string starts with the BOM character (\uFEFF). If it does, we remove it using the substring method.
Finally, we write the modified XML string back to the file using the Files.write method.
By removing the BOM character from the XML file, we can fix the «Fatal Error :1:1: Content is not allowed in prolog» issue in Java.
Method 3: Fix Incorrect XML Syntax
To fix the "Fatal Error :1:1: Content is not allowed in prolog" error in Java by correcting the XML syntax, you can follow the steps below:
- Check the XML file for any syntax errors, such as missing closing tags or invalid characters at the beginning of the file.
- Use a text editor or an XML editor to correct any syntax errors in the XML file.
- If the error persists, you can try adding the following code to your Java program so that the file is decoded as UTF-8:
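// Uses java.io.*, java.nio.charset.StandardCharsets, javax.xml.parsers.*,
// org.w3c.dom.Document and org.xml.sax.InputSource
try (Reader reader = new InputStreamReader(new FileInputStream("file.xml"), StandardCharsets.UTF_8)) {
    // The Reader decodes the bytes as UTF-8, whatever the XML declaration says
    InputSource is = new InputSource(reader);
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    DocumentBuilder db = dbf.newDocumentBuilder();
    Document document = db.parse(is);
}
This code decodes the file as UTF-8 explicitly and parses it using the DocumentBuilder class. The setEncoding call is left out because it has no effect when the InputSource wraps a Reader. Note that InputStreamReader does not strip a leading BOM, so a file that starts with one will still fail when parsed this way.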
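- You can also try using the following code to remove any non-ASCII characters from the XML file:
StringBuilder sb = new StringBuilder();
// try-with-resources closes the reader; the charset is stated instead of using the platform default
try (BufferedReader br = new BufferedReader(
        new InputStreamReader(new FileInputStream("file.xml"), StandardCharsets.UTF_8))) {
    String line;
    while ((line = br.readLine()) != null) {
        // Keep only printable ASCII (0x20-0x7e) and restore the line break
        sb.append(line.replaceAll("[^\\x20-\\x7e]", "")).append('\n');
    }
}
String cleanXml = sb.toString();
This code reads the XML file line by line and removes any characters that are not in the ASCII range of 0x20 to 0x7e. Be aware that this also deletes tabs and every legitimate non-ASCII character, such as accented letters, so it is only safe for documents that are meant to be pure ASCII.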
By following these steps, you should be able to fix the «Fatal Error :1:1: Content is not allowed in prolog» error in Java using the «Fix Incorrect XML Syntax» method.
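One detail matters when choosing between an InputStream and a Reader, as in the snippet above that wraps the file in an InputStreamReader. Given raw bytes, the JDK's built-in parser detects a UTF-8 BOM itself; given a Reader, it receives the BOM as an ordinary character in front of the declaration. The sketch below illustrates the difference with an in-memory document; BomReaderDemo and parses are illustrative names, and the behavior described is that of the JDK's bundled parser.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical demo class, for illustration only.
class BomReaderDemo {
    /** Returns true if the source parses as well-formed XML. */
    static boolean parses(InputSource source) throws Exception {
        try {
            DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(source);
            return true;
        } catch (SAXParseException e) {
            return false;
        }
    }

    /** A small UTF-8 document that starts with a byte order mark. */
    static byte[] sampleWithBom() {
        return "\uFEFF<?xml version=\"1.0\" encoding=\"UTF-8\"?><root/>"
                .getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        byte[] bytes = sampleWithBom();
        // Byte stream: the parser inspects the leading bytes itself
        System.out.println("as InputStream: "
                + parses(new InputSource(new ByteArrayInputStream(bytes))));
        // Character stream: the decoded U+FEFF is passed on as content
        System.out.println("as Reader: "
                + parses(new InputSource(new InputStreamReader(
                        new ByteArrayInputStream(bytes), StandardCharsets.UTF_8))));
    }
}
```

If a Reader is required, strip the leading U+FEFF first; otherwise passing the InputStream directly is usually the simpler option.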
Method 4: Check for invalid characters
To fix the «Fatal Error :1:1: Content is not allowed in prolog» issue in Java, you can use the «Check for invalid characters» method. This issue occurs when the XML file contains invalid characters at the beginning of the file, which causes the parser to fail.
To check for invalid characters, you can use the following code:
public static void main(String[] args) throws Exception {
    // Decode the whole file as UTF-8 (uses java.nio.file.Files, java.nio.file.Paths
    // and java.nio.charset.StandardCharsets)
    String xml = new String(Files.readAllBytes(Paths.get("file.xml")), StandardCharsets.UTF_8);
    int start = xml.indexOf('<');
    if (start < 0) {
        System.out.println("No '<' found: the file is not XML");
    } else if (start == 0) {
        System.out.println("Nothing precedes the first '<'");
    } else {
        // Print each leading character as a code point so invisible ones become visible
        xml.substring(0, start).codePoints()
                .forEach(c -> System.out.printf("U+%04X%n", c));
    }
}
This code decodes the file as UTF-8, finds the first "<" character, and prints the code point of every character that comes before it. A byte order mark shows up as U+FEFF and a space as U+0020. The code only reports what it finds; it does not modify the file.
You can also use the following code to remove any invalid characters from the XML file:
public static void main(String[] args) throws Exception {
    // try-with-resources closes the reader even if reading fails
    try (BufferedReader br = new BufferedReader(
            new InputStreamReader(new FileInputStream("file.xml"), "UTF-8"))) {
        String line;
        while ((line = br.readLine()) != null) {
            // Drop everything outside printable ASCII (0x20-0x7e)
            line = line.replaceAll("[^\\x20-\\x7e]", "");
            System.out.println(line);
        }
    }
}
This code reads the XML file line by line, removes every character outside the ASCII range 0x20-0x7e, and prints the result; it does not change the file itself. The filter is blunt: it removes a BOM and control characters, but also tabs and any legitimate non-ASCII text, so check the output before using it.
In summary, to fix the «Fatal Error :1:1: Content is not allowed in prolog» issue in Java, you can use the «Check for invalid characters» method. This involves checking the XML file for invalid characters or removing any invalid characters from the file. The code examples provided demonstrate how to implement these methods in Java.